replace
- pdcleaner.cleaning.cleaning.replace(self, detector, value=None, inplace=False)
Clean by replacing errors with a value.
- Parameters:
detector (a Detector object) – The detector object that identifies the entries to clean.
value (string, numeric, dict or callable) –
The replacement strategy. If a single value is given, every error is replaced with that value.
If a dictionary is given, each error is replaced with the value stored under the matching key. An erroneous value that has no corresponding key in the dictionary is replaced with NaN.
If a callable is given, it is computed on the Series and should return a scalar or a Series. It must not modify the input Series.
inplace (bool, default False) – Whether to perform the operation in place on the data.
- Return type:
The modified data or None if inplace is True
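The three replacement strategies and the inplace switch can be illustrated with a minimal sketch in plain pandas. This is not pdcleaner's implementation: `replace_errors` and the boolean mask `is_error` are hypothetical stand-ins for the method and the detector, the callable is assumed to receive only the erroneous entries, and `value=None` is assumed to mean NaN.

```python
import numpy as np
import pandas as pd


def replace_errors(series, is_error, value=None, inplace=False):
    """Hypothetical plain-pandas stand-in for the documented behaviour.

    `is_error` is a boolean Series marking the entries to clean; it plays
    the role of the detector.
    """
    target = series if inplace else series.copy()
    errors = series[is_error]

    if callable(value):
        # Assumption: the callable sees only the erroneous entries.
        new_values = value(errors)
    elif isinstance(value, dict):
        # Errors without a matching key become NaN.
        new_values = errors.map(value)
    else:
        # A single value replaces every error (None is treated as NaN here).
        new_values = np.nan if value is None else value

    # A Series on the right-hand side is aligned on the index.
    target.loc[is_error] = new_values
    return None if inplace else target


series = pd.Series([np.nan, 0, -5, 4, 6, 100])
is_error = (series < 0) | (series > 10)  # same entries as bounded(lower=0, upper=10)

print(replace_errors(series, is_error, value=5))
print(replace_errors(series, is_error, value={100: 10}))
print(replace_errors(series, is_error, value=lambda s: s.clip(lower=0) / 10))
```

With `inplace=True` the sketch modifies the Series it was given and returns None, mirroring the return type described above.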
Examples
>>> import numpy as np
>>> import pandas as pd
>>> import pdcleaner
>>> series = pd.Series([np.nan, 0, -5, 4, 6, 100])
>>> detector = series.cleaner.detect.bounded(lower=0, upper=10)
>>> series.cleaner.clean.replace(detector, value=5)
0    NaN
1    0.0
2    5.0
3    4.0
4    6.0
5    5.0
dtype: float64
Replace using a dictionary
>>> series.cleaner.clean.replace(detector, value={100: 10})
0     NaN
1     0.0
2     NaN
3     4.0
4     6.0
5    10.0
dtype: float64
Replace using a lambda (the lambda applies to the series of erroneous entries)
>>> series.cleaner.clean.replace(detector,
...     value=lambda s: s.clip(lower=0) / 10)
0     NaN
1     0.0
2     0.0
3     4.0
4     6.0
5    10.0
dtype: float64
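In the dictionary example above, the error without a matching key becomes NaN. If such entries should keep their original value instead, one option is a callable that combines a mapping with a fallback. The sketch below runs on a hand-built Series of erroneous entries in plain pandas; `map_or_keep` and `mapping` are hypothetical names, and it assumes the callable receives only the erroneous entries, as noted for the lambda example.

```python
import pandas as pd

mapping = {100: 10}


def map_or_keep(errors):
    # Builds a new Series; the input Series is left unmodified.
    # Entries without a key in `mapping` fall back to their original value.
    return errors.map(mapping).fillna(errors)


# Stand-in for the erroneous entries of the example series (index 2 and 5).
errors = pd.Series([-5.0, 100.0], index=[2, 5])
print(map_or_keep(errors))
```

Under that assumption, the function could be passed as `value=map_or_keep` in place of the dictionary.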