types#

class pdcleaner.detection.types.types(obj, detector=None, ptype=None)#

Bases: _SeriesDetector

Detect elements with type errors.

Intended to be used through the detect method, either as an attribute or with the keyword ‘types’:

>>> series.cleaner.detect.types(...)
>>> series.cleaner.detect('types',...)

This detection method flags an element as a potential error when its Python type differs from the specified one.

If no type is given, the type of the first row is used as the reference, and every element with a different type is flagged as an error.

Note

NA values are not treated as errors.
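The rule described above can be sketched in plain pandas. This is only an illustration and not the pdcleaner implementation: `flag_type_errors` is a hypothetical helper, and comparing with an exact `type(...) is` check (rather than `isinstance`) is an assumption.

```python
import pandas as pd


def flag_type_errors(series, ptype=None):
    """Hypothetical helper (not part of pdcleaner) mimicking the documented rule."""
    if ptype is None:
        # Assumption: fall back to the type of the first row, as described above.
        ptype = type(series.iloc[0])
    # Flag every element whose exact Python type differs from ptype.
    mismatch = series.map(lambda value: type(value) is not ptype).astype(bool)
    # NA values are never treated as errors.
    return mismatch & ~series.isna()


mixed = pd.Series(["A", 2, None, "D"], dtype=object)
mask = flag_type_errors(mixed)
```

Here `mask` is a boolean Series aligned with `mixed`; only the integer `2` is flagged, and the missing value is left alone.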

Parameters:

ptype (Python built-in data type or None (default)) – the expected type, e.g. int, float, str, bool. If None, the type of the first row is used.

Raises:

TypeError – when the given ptype is not a valid Python built-in data type
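One possible way to perform such a check is sketched below. How pdcleaner actually validates `ptype` is not shown in this page, so `ensure_builtin_type` is a hypothetical function and the lookup in the standard `builtins` module is an assumption.

```python
import builtins


def ensure_builtin_type(ptype):
    """Hypothetical validator (not part of pdcleaner)."""
    if ptype is None:
        # None means "infer the type later", so there is nothing to validate.
        return None
    # A built-in type is a class that the builtins module exposes under its own name.
    is_builtin = (
        isinstance(ptype, type)
        and getattr(builtins, ptype.__name__, None) is ptype
    )
    if not is_builtin:
        raise TypeError(f"{ptype!r} is not a Python built-in data type")
    return ptype
```

With this sketch, `ensure_builtin_type(int)` returns `int`, while passing the string `'int'` or a user-defined class raises `TypeError`.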

Examples

>>> import pandas as pd
>>> import pdcleaner
>>> series = pd.Series([1, 2, 100, 3], dtype='float64')
>>> series[1] = 'One'
>>> detector = series.cleaner.detect.types(ptype=float)
>>> print(detector.is_error())
0    False
1     True
2    False
3    False
dtype: bool

Missing values are not treated as errors.

>>> import numpy as np
>>> series = pd.Series([1., 2., np.nan, 3.])
>>> series[1] = 'One'
>>> series[2] = np.nan
>>> detector = series.cleaner.detect.types(ptype=int)
>>> print(detector.is_error())
0    False
1     True
2    False
3    False
dtype: bool

If no type is specified, elements whose type differs from that of the first element are flagged.

>>> series = pd.Series(['A', 2, np.nan, 'D'])
>>> detector = series.cleaner.detect('types')
>>> type(series[0])
<class 'str'>
>>> print(detector.ptype)
<class 'str'>
>>> print(detector.is_error())
0    False
1     True
2    False
3    False
dtype: bool

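A detector can be reused on another Series. Here the first detector infers str as the expected type, and the second detector applies that type to series_test.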

>>> series = pd.Series(['A', 2, np.nan, 'D'])
>>> series_test = pd.Series([1, 'Two'])
>>> detector = series.cleaner.detect('types')
>>> second_detector = series_test.cleaner.detect(detector)
>>> print(second_detector.is_error())
0     True
1    False
dtype: bool
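The idea of inferring a type once and applying it to other data can be sketched without pdcleaner. The `TypeRule` class below is hypothetical and only mirrors the behaviour shown in the example above; it is not how the library stores or transfers detector settings.

```python
import pandas as pd


class TypeRule:
    """Hypothetical stand-in for a fitted detector: it only remembers a type."""

    def __init__(self, reference, ptype=None):
        # Assumption: when no type is given, use the type of the first row.
        self.ptype = type(reference.iloc[0]) if ptype is None else ptype

    def is_error(self, series):
        # Flag elements of a different type, leaving NA values unflagged.
        mismatch = series.map(lambda value: type(value) is not self.ptype)
        return mismatch.astype(bool) & ~series.isna()


reference = pd.Series(["A", 2, None, "D"], dtype=object)
other = pd.Series([1, "Two"], dtype=object)

rule = TypeRule(reference)          # infers str from the first row
flags = rule.is_error(other)        # applies str to a different Series
```

Because the rule keeps only the inferred type, it flags `1` in `other` even though `other` was never used to infer anything.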

Attributes Summary

index      Indices of the rows detected as errors
n_errors   Number of rows detected as errors
name
obj        The object (Series or DataFrame) containing the data to which the detection is applied
ptype      Built-in Python type

Methods Summary

detected()     Series or DataFrame containing only the detected errors
has_errors()   Returns True if any error has been detected, False otherwise
is_error()     Return a boolean same-sized object indicating if the values are flagged as errors
not_error()    Return a boolean same-sized object indicating if the values are NOT flagged as errors
report()       Prints a detection report
valid()        Series or DataFrame containing only the valid values

Attributes Documentation

index#

Indices of the rows detected as errors

n_errors#

Number of rows detected as errors

name = 'types'#

obj#

The object (Series or DataFrame) containing the data to which the detection is applied

ptype#

Built-in Python type

Methods Documentation

detected()#

Series or DataFrame containing only the detected errors

has_errors() → bool#

Returns True if any error has been detected, False otherwise

is_error() → Series#

Return a boolean same-sized object indicating if the values are flagged as errors

not_error() → Series#

Return a boolean same-sized object indicating if the values are NOT flagged as errors

report()#

Prints a detection report

valid()#

Series or DataFrame containing only the valid values
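As a rough guide to how these attributes and methods relate to one another, the sketch below derives equivalent values from a boolean error mask using plain pandas. The mask is written by hand to match the first example on this page; the variable names are illustrative and none of this is pdcleaner code.

```python
import pandas as pd

series = pd.Series([1.0, "One", 100.0, 3.0], dtype=object)
# Hand-written mask playing the role of is_error() in the first example.
is_error = pd.Series([False, True, False, False], index=series.index)

not_error = ~is_error                              # counterpart of not_error()
detected = series[is_error]                        # counterpart of detected()
valid = series[not_error]                          # counterpart of valid()
error_index = series.index[is_error.to_numpy()]    # counterpart of index
n_errors = int(is_error.sum())                     # counterpart of n_errors
has_errors = bool(is_error.any())                  # counterpart of has_errors()
```

Everything here follows from the single mask, which is why `is_error()` is usually the first thing to inspect when a result looks surprising.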