custom
- class pdcleaner.detection.basic.custom(obj, detector=None, error_func=None)
Bases: _Detector
Detect errors using a user-defined callable.
Intended to be used by the detect method with the keyword ‘custom’:
>>> df.cleaner.detect.custom(...)
>>> df.cleaner.detect('custom',...)
- Parameters:
error_func (Callable) – returns a boolean: True if the element/row is an error, False otherwise
- Raises:
TypeError – when error_func is not a callable
ValueError – when the number of arguments of error_func does not match the number of columns
TypeError – when error_func does not return a boolean
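The three checks listed under Raises can be illustrated with a minimal, standard-library-only sketch. The helper names check_error_func and apply_checked are hypothetical and are not part of pdcleaner; the sketch only mirrors the documented conditions and makes no claim about how or when the library performs them.

```python
import inspect


def check_error_func(error_func, n_columns):
    """Hypothetical helper: mirror the documented callable and arity checks."""
    if not callable(error_func):
        raise TypeError("error_func must be a callable")
    n_args = len(inspect.signature(error_func).parameters)
    if n_args != n_columns:
        raise ValueError(
            f"error_func takes {n_args} argument(s) but the data has {n_columns} column(s)"
        )
    return error_func


def apply_checked(error_func, row):
    """Hypothetical helper: call error_func on one row and require a plain bool."""
    result = error_func(*row)
    if not isinstance(result, bool):
        raise TypeError("error_func must return a boolean")
    return result


# Two-argument callable applied to two-column rows (assumed illustration data).
checked = check_error_func(lambda x, y: x**2 != y, n_columns=2)
flags = [apply_checked(checked, row) for row in [(1, 1), (2, 3), (3, 9)]]
print(flags)  # [False, True, False]
```

This sketch works on plain Python values; a check applied to pandas data would probably also need to accept NumPy boolean scalars.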
Examples
>>> import pandas as pd
>>> import pdcleaner
with a lambda function
>>> series = pd.Series([-1, 2, 3])
>>> detector = series.cleaner.detect('custom', error_func=lambda x: x<0)
>>> print(detector.is_error())
0     True
1    False
2    False
dtype: bool
with a function
>>> def f(x) -> bool:
...     if x**2 > 5:
...         return True
...     return False
>>> detector = series.cleaner.detect('custom', error_func=f)
>>> print(detector.is_error())
0    False
1    False
2     True
dtype: bool
with a DataFrame, the callable must take as many arguments as the DataFrame has columns.
>>> df = pd.DataFrame({'col1' : [1,2,3], 'col2' : [1,3,9] })
>>> bad_square = lambda x,y: x**2!=y
>>> df.cleaner.detect('custom', error_func=bad_square).is_error()
0    False
1     True
2    False
dtype: bool
Attributes Summary
error_func    Custom error function
index         Indices of the rows detected as errors
n_errors      Number of rows detected as errors
obj           The object (Series or DataFrame) containing the data to which the detection is applied
Methods Summary
detected()      Series or DataFrame containing only the detected errors
has_errors()    Returns True if any error has been detected, False otherwise
is_error()      Return a boolean same-sized object indicating if the values are flagged as errors
not_error()     Return a boolean same-sized object indicating if the values are NOT flagged as errors
report()        Prints a detection report
valid()         Series or DataFrame containing only the valid values
Attributes Documentation
- error_func
Custom error function
- index
Indices of the rows detected as errors
- n_errors
Number of rows detected as errors
- name = 'custom'
- obj
The object (Series or DataFrame) containing the data to which the detection is applied
Methods Documentation
- detected()
Series or DataFrame containing only the detected errors
- has_errors() -> bool
Returns True if any error has been detected, False otherwise
- is_error() -> Series
Return a boolean same-sized object indicating if the values are flagged as errors
- not_error() -> Series
Return a boolean same-sized object indicating if the values are NOT flagged as errors
- report()
Prints a detection report
- valid()
Series or DataFrame containing only the valid values
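
The relationship between the methods and attributes above can be sketched with plain pandas boolean masks. This is an assumed analogy for a Series, written without pdcleaner; the variable names are illustrative and nothing here describes the library's internals.

```python
import pandas as pd

series = pd.Series([-1, 2, 3])

# Element-wise boolean mask, analogous to what is_error() returns.
mask = series.map(lambda x: x < 0).astype(bool)

# Logical complement, analogous to not_error().
not_mask = ~mask

# Rows flagged as errors and rows kept as valid, analogous to detected() and valid().
detected = series[mask]
valid = series[not_mask]

# Scalar summaries, analogous to n_errors, has_errors() and index.
n_errors = int(mask.sum())
has_errors = bool(mask.any())
error_index = list(detected.index)

print(detected.tolist(), valid.tolist(), n_errors, has_errors, error_index)
```

Under this reading, detected() and valid() partition the original data, and n_errors equals the length of index.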