missing

class pdcleaner.detection.basic.missing(obj, detector=None, how='any')

Bases: _Detector

Detect elements containing missing values

Intended to be called through the detect method with the keyword ‘missing’:

>>> df.cleaner.detect.missing(...)
>>> df.cleaner.detect('missing', ...)
Parameters:

how (str, default 'any') –

  • ‘any’ : a row is flagged as an error if it contains at least one NA value.

  • ‘all’ : a row is flagged as an error only if all of its values are NA.

Raises:

ValueError – if an unknown value is passed to the how parameter

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import pdcleaner
>>> df = pd.DataFrame({'col1': ['Alice', 'Bob', 'Charles'],
...                    'col2': [15, np.nan, 11]})
>>> detector = df.cleaner.detect.missing(how='any')
>>> print(detector.is_error())
0    False
1     True
2    False
dtype: bool

Flag only the rows in which all values are NA:

>>> df = pd.DataFrame({'col1': ['Alice', np.nan, 'Charles'],
...                    'col2': [np.nan, np.nan, np.nan]})
>>> detector = df.cleaner.detect.missing(how='all')
>>> print(detector.is_error())
0    False
1     True
2    False
dtype: bool

Also works with a Series, in which case the ‘how’ parameter is not needed:

>>> series = pd.Series(['Alice', 'Bob', np.nan, 'Charles'])
>>> detector = series.cleaner.detect('missing')
>>> print(detector.is_error())
0    False
1    False
2     True
3    False
dtype: bool
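
The ‘any’ and ‘all’ modes shown above behave like row-wise reductions over a missing-value mask. The following is a minimal pandas-only sketch of that rule, not pdcleaner's implementation: missing_mask is a hypothetical helper introduced here for illustration, and the wording of its error message is an assumption.

```python
import numpy as np
import pandas as pd


def missing_mask(obj, how="any"):
    """Hypothetical helper: row-level missing-value mask in plain pandas."""
    if how not in ("any", "all"):
        # Mirrors the documented ValueError; the message text is an assumption.
        raise ValueError(f"unknown value for how: {how!r}")
    na = obj.isna()
    if isinstance(obj, pd.Series):
        # A Series holds one value per row, so 'any' and 'all' coincide.
        return na
    # Reduce across columns (axis=1) to get one flag per row.
    return na.any(axis=1) if how == "any" else na.all(axis=1)


df = pd.DataFrame({"col1": ["Alice", np.nan, "Charles"],
                   "col2": [15, np.nan, np.nan]})

any_mask = missing_mask(df, how="any")  # rows 1 and 2 contain at least one NA
all_mask = missing_mask(df, how="all")  # only row 1 is entirely NA
```

With this frame the two modes disagree on row 2, which has a name but no number: it is flagged under ‘any’ and left alone under ‘all’.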

Attributes Summary

how

Checking mode

index

Indices of the rows detected as errors

n_errors

Number of rows detected as errors

name

obj

The object (Series or DataFrame) containing the data to which the detection is applied

obj_type

Type of object

Methods Summary

detected()

Series or DataFrame containing only the detected errors

has_errors()

Returns True if any error has been detected, False otherwise

is_error()

Return a boolean same-sized object indicating if the values are flagged as errors

not_error()

Return a boolean same-sized object indicating if the values are NOT flagged as errors

report()

Print a detection report

valid()

Series or DataFrame containing only the valid values

Attributes Documentation

how

Checking mode

index

Indices of the rows detected as errors

n_errors

Number of rows detected as errors

name = 'missing'

obj

The object (Series or DataFrame) containing the data to which the detection is applied

obj_type

Type of object
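
As a rough illustration of what index and n_errors describe, the sketch below derives comparable values from a plain boolean mask. It assumes the mask plays the role of is_error(); the variable names are illustrative only and pdcleaner itself is not called.

```python
import numpy as np
import pandas as pd

series = pd.Series(["Alice", "Bob", np.nan, "Charles"])

# Stand-in for detector.is_error(): True where a value is missing.
mask = series.isna()

error_index = series.index[mask]  # labels of the flagged rows, cf. `index`
error_count = int(mask.sum())     # number of flagged rows, cf. `n_errors`
```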

Methods Documentation

detected()

Series or DataFrame containing only the detected errors

has_errors() → bool

Returns True if any error has been detected, False otherwise

is_error() → Series

Return a boolean same-sized object indicating if the values are flagged as errors

not_error() → Series

Return a boolean same-sized object indicating if the values are NOT flagged as errors

report()

Print a detection report

valid()

Series or DataFrame containing only the valid values
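
The methods above can be read as different views of one row-level boolean mask. The sketch below reproduces those views in plain pandas under that assumption. It is not pdcleaner's code, and the variable names simply echo the method names for readability.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["Alice", "Bob", "Charles"],
                   "col2": [15, np.nan, 11]})

# Stand-in for detector.is_error() with how='any'.
is_error = df.isna().any(axis=1)

not_error = ~is_error              # cf. not_error()
has_errors = bool(is_error.any())  # cf. has_errors()
detected = df[is_error]            # rows flagged as errors, cf. detected()
valid = df[not_error]              # remaining rows, cf. valid()
```

Here detected keeps only the row for Bob, whose col2 value is missing, and valid keeps the other two rows.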