I have a pandas dataframe df with pandas.tseries.index.DatetimeIndex as index.
The data is like this:
Time Open High Low Close Volume
2007-04-01 21:02:00 1.968 2.389 1.968 2.389 18.300000
2007-04-01 21:03:00 157.140 157.140 157.140 157.140 2.400000
....
I want to replace one data point, let's say 2.389 in column Close, with NaN:
In: df["Close"].replace(2.389, np.nan)
Out: 2007-04-01 21:02:00 2.389
2007-04-01 21:03:00 157.140
Replace did not change 2.389 to NaN. What's wrong?
replace might not work with floats because the floating-point representation you see in the repr of the DataFrame might not be the same as the underlying float. For example, the actual Close value might be:
In [141]: df = pd.DataFrame({'Close': [2.389000000001]})
yet the repr of df looks like:
In [142]: df
Out[142]:
Close
0 2.389
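To check whether this is what is happening, one option is to pull the stored value out of the frame and compare it directly. The snippet below is a minimal sketch that reuses the example value 2.389000000001 from above; your real data may differ from 2.389 by some other small amount.

```python
import numpy as np
import pandas as pd

# Same illustrative value as above (an assumption, not the asker's real data)
df = pd.DataFrame({'Close': [2.389000000001]})

# The default DataFrame display rounds floats, hiding the trailing digits
shown = str(df)

# Extract the underlying float to see what is actually stored
stored = float(df['Close'].iloc[0])
print(repr(stored))

# Exact equality fails, which is why replace() finds nothing to change
print(stored == 2.389)
unchanged = df['Close'].replace(2.389, np.nan)
```

If the printed value carries more digits than the DataFrame display shows, an exact-match replace will leave the column untouched.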
So instead of checking for float equality, it is usually better to check for closeness:
In [150]: import numpy as np
In [151]: mask = np.isclose(df['Close'], 2.389)
In [152]: mask
Out[152]: array([ True], dtype=bool)
You can then use the boolean mask to select and change the desired values:
In [145]: df.loc[mask, 'Close'] = np.nan
In [146]: df
Out[146]:
Close
0 NaN
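Putting the steps together on data shaped like the question's, a possible end-to-end sketch follows. The two rows and the slightly-off first Close value are made up for illustration, and the tolerances are simply np.isclose's defaults written out explicitly; Series.mask is used here as an alternative to the df.loc assignment above.

```python
import numpy as np
import pandas as pd

# Hypothetical rows modelled on the question; the first Close is slightly off 2.389
idx = pd.to_datetime(['2007-04-01 21:02:00', '2007-04-01 21:03:00'])
df = pd.DataFrame({'Close': [2.389000000001, 157.140]}, index=idx)

# Closeness test; rtol and atol are np.isclose's default values, shown explicitly
mask = np.isclose(df['Close'], 2.389, rtol=1e-05, atol=1e-08)

# Series.mask replaces entries where the condition is True (with NaN by default)
df['Close'] = df['Close'].mask(mask)
print(df)
```

Tighten rtol and atol if other values in the column lie close to the one you want to replace.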