cast
- pdcleaner.cleaning.cleaning.cast(self, detector, **kwargs)
Clean by casting each value to the target type. A value that cannot be cast is replaced with a missing value (NaN for floats; the examples below show <NA> for ints and NaT for dates). This method works only with the castable detector.
When casting to date, any parameter accepted by pd.to_datetime can be passed as a keyword argument. See https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
- Parameters:
detector (a Detector object) – The detector object that identifies the entries to clean
- Return type:
The modified data
Examples
>>> import pandas as pd
>>> import pdcleaner  # assumption: importing the package registers the .cleaner accessor
>>> series = pd.Series(['100 000', '154,5', '9 000', '250,12'], dtype='object')
>>> detector = series.cleaner.detect.castable(target='float', thousands=' ', decimal=',')
>>> series.cleaner.clean('cast', detector)
0    100000
1    154.5
2    9000
3    250.12
dtype: float64
Casting into int
>>> detector = series.cleaner.detect.castable(target='int', thousands=' ', decimal=',')
>>> series.cleaner.clean('cast', detector)
0    100000
1    <NA>
2    9000
3    <NA>
dtype: Int32
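For readers who want to see the coercion semantics outside pdcleaner, the numeric cases above can be approximated with plain pandas. This is only a sketch under stated assumptions, not pdcleaner's implementation: the helper name coerce_numeric is hypothetical, and it assumes the thousands separator is simply stripped and the decimal mark swapped for a dot before calling pd.to_numeric.

```python
import pandas as pd


def coerce_numeric(series: pd.Series, thousands: str = " ", decimal: str = ",") -> pd.Series:
    """Hypothetical helper (not part of pdcleaner): normalize separators,
    then turn anything that still fails to parse into NaN."""
    normalized = (
        series.astype(str)
        .str.replace(thousands, "", regex=False)  # '100 000' -> '100000'
        .str.replace(decimal, ".", regex=False)   # '154,5'   -> '154.5'
    )
    return pd.to_numeric(normalized, errors="coerce")


series = pd.Series(["100 000", "154,5", "9 000", "250,12", "Alice"], dtype="object")

# Float target: unparseable entries become NaN.
as_float = coerce_numeric(series)

# Int target: keep only whole numbers, everything else becomes <NA>.
is_whole = as_float.notna() & (as_float % 1 == 0)
as_int = as_float.where(is_whole).astype("Int32")
```

The nullable Int32 dtype is used here because a plain NumPy integer column cannot hold missing values.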
Casting into date
>>> series = pd.Series(['1.05', '154', '15/05/2022', 'Alice'], dtype='object')
>>> detector = series.cleaner.detect.castable(target='date')
>>> series.cleaner.clean('cast', detector, format="%d/%m/%Y")
0    NaT
1    NaT
2    2022-05-15
3    NaT
dtype: datetime64[ns]
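Since the date cast accepts pd.to_datetime parameters, the same result can be cross-checked directly in pandas. The snippet below is a rough equivalent, assuming the cast behaves like pd.to_datetime with errors="coerce"; that is an assumption made for illustration, not something the pdcleaner documentation states.

```python
import pandas as pd

series = pd.Series(["1.05", "154", "15/05/2022", "Alice"], dtype="object")

# Entries that do not match the explicit format are coerced to NaT
# instead of raising an error.
dates = pd.to_datetime(series, format="%d/%m/%Y", errors="coerce")
```

Passing an explicit format keeps strings such as '1.05' or '154' from being interpreted as dates by a more permissive parser.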