I tried to interpolate the NaN values in my DataFrame using the interpolate()
method. However, the method failed with the error:
Cannot interpolate with all NaNs.
Here's the code:
try:
    # Fill NaN gaps in place, using the index values as the x-axis.
    df3.interpolate(method='index', inplace=True)
    processor._arma(df3['TCA'])
except Exception as e:
    # Log the error message, the frame length, and the full frame to stderr.
    sys.stderr.write('%s: [%s] %s\n' % (time.strftime("%Y-%m-%d %H:%M:%S"), nid3, e))
    sys.stderr.write('%s: [%s] len=%d\n' % (time.strftime("%Y-%m-%d %H:%M:%S"), nid3, len(df3.index)))
    sys.stderr.write('%s: [%s] %s\n' % (time.strftime("%Y-%m-%d %H:%M:%S"), nid3, df3.to_string()))
This is strange, because most of the data is already filled in, as you can see in log 1 and log 2. The DataFrame has 20 rows in both logs, all of them shown below. Even when every cell is filled (log 1), I still can't use the interpolate method. BTW, df3
is a global variable; I'm not sure whether that could be a problem.
log 1
2016-01-21 22:06:11: [ESIG_node_003_400585511] Cannot interpolate with all NaNs.
2016-01-21 22:06:11: [ESIG_node_003_400585511] len=20
2016-01-21 22:06:11: [ESIG_node_003_400585511]
TCA TCB TCC
2016-01-21 20:06:22 19 17 18
2016-01-21 20:06:23 19 17 18
2016-01-21 20:06:24 18 18 18
2016-01-21 20:06:25 18 17 18
2016-01-21 20:06:26 18 18 18
2016-01-21 20:06:27 19 18 18
2016-01-21 20:06:28 19 17 18
2016-01-21 20:06:29 18 18 18
2016-01-21 20:06:30 18 17 18
2016-01-21 20:06:31 19 17 18
2016-01-21 20:06:32 18 17 18
2016-01-21 20:06:33 18 18 18
2016-01-21 20:06:34 19 18 18
2016-01-21 20:06:35 18 17 18
2016-01-21 20:06:36 19 18 18
2016-01-21 20:06:37 18 18 18
2016-01-21 20:06:38 18 18 18
2016-01-21 20:06:39 19 18 18
2016-01-21 20:06:40 18 17 18
2016-01-21 20:06:41 18 18 18
log 2
2016-01-21 22:06:14: [ESIG_node_003_400585511] Cannot interpolate with all NaNs.
2016-01-21 22:06:14: [ESIG_node_003_400585511] len=20
2016-01-21 22:06:14: [ESIG_node_003_400585511]
TCA TCB TCC
2016-01-21 20:06:33 18 18 18
2016-01-21 20:06:34 19 18 18
2016-01-21 20:06:35 18 17 18
2016-01-21 20:06:36 19 18 18
2016-01-21 20:06:37 18 18 18
2016-01-21 20:06:38 18 18 18
2016-01-21 20:06:39 19 18 18
2016-01-21 20:06:40 18 17 18
2016-01-21 20:06:41 18 18 18
2016-01-21 20:06:42 NaN NaN NaN
2016-01-21 20:06:43 NaN NaN NaN
2016-01-21 20:06:44 NaN NaN NaN
2016-01-21 20:06:45 NaN NaN NaN
2016-01-21 20:06:46 19 18 18
2016-01-21 20:06:47 18 17 18
2016-01-21 20:06:48 18 18 18
2016-01-21 20:06:49 19 18 18
2016-01-21 20:06:50 18 17 18
2016-01-21 20:06:51 18 18 18
2016-01-21 20:06:52 19 17 18
Check that your DataFrame has numeric dtypes, not object dtypes. The error
TypeError: Cannot interpolate with all NaNs
can occur if the DataFrame contains columns of object dtype. For example, if
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':np.array([1,np.nan,30], dtype='O')},
index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
'2016-01-21 20:06:24'])
then df.interpolate() raises the TypeError.
To check if your DataFrame has columns with object dtype, look at df3.dtypes. For the example DataFrame above:
In [92]: df.dtypes
Out[92]:
A object
dtype: object
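When the DataFrame has many columns, the same check can be done programmatically. The following is only a sketch: the helper name object_columns and the extra column B are made up for illustration, while select_dtypes is a standard pandas method.

```python
import numpy as np
import pandas as pd


def object_columns(frame):
    """Return the names of the columns stored with object dtype (hypothetical helper)."""
    return frame.select_dtypes(include=['object']).columns.tolist()


# Same kind of data as the example above, plus one genuinely numeric column.
df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O'),
                   'B': [1.0, 2.0, 3.0]},
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                         '2016-01-21 20:06:24'])

bad = object_columns(df)
print(bad)
```

Any column name that shows up in the list is a candidate for conversion before calling interpolate.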
To fix the problem, you need to ensure the DataFrame has numeric columns with native NumPy dtypes. Obviously, it would be best to build the DataFrame correctly from the very beginning. So the best solution depends on how you are building the DataFrame.
A less appealing patch-up fix would be to use pd.to_numeric to convert the object arrays to numeric arrays after the fact:
for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')
With errors='coerce', any value that cannot be converted to a number is converted to NaN. After calling pd.to_numeric on each column, notice that the dtype is now float64:
In [94]: df.dtypes
Out[94]:
A float64
dtype: object
Once the DataFrame has numeric dtypes and a DatetimeIndex, df.interpolate(method='time') will work:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':np.array([1,np.nan,30], dtype='O')},
index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
'2016-01-21 20:06:24'])
for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')
df.index = pd.DatetimeIndex(df.index)
df = df.interpolate(method='time')
print(df)
yields
A
2016-01-21 20:06:22 1.0
2016-01-21 20:06:23 15.5
2016-01-21 20:06:24 30.0
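To relate this back to the question, here is a rough sketch of the same recipe applied to the last few rows of log 2, where four consecutive rows are NaN. It assumes that df3's columns are stored as object dtype, which the question does not confirm; the names raw and fixed are invented for the example.

```python
import numpy as np
import pandas as pd

# Seconds 40..47 of log 2; the four middle rows are the NaN gap.
index = pd.to_datetime(['2016-01-21 20:06:%02d' % s for s in range(40, 48)])
nan = np.nan

# Assumption: object dtype stands in for whatever produced df3.
raw = pd.DataFrame({
    'TCA': np.array([18, 18, nan, nan, nan, nan, 19, 18], dtype='O'),
    'TCB': np.array([17, 18, nan, nan, nan, nan, 18, 17], dtype='O'),
    'TCC': np.array([18, 18, nan, nan, nan, nan, 18, 18], dtype='O'),
}, index=index)

# Convert every column to a numeric dtype, then interpolate along time.
fixed = raw.apply(pd.to_numeric, errors='coerce')
fixed = fixed.interpolate(method='time')

print(fixed.dtypes)
print(fixed)
```

Because the gap lies between two valid rows, method='time' fills it linearly in time; NaNs at the very start of a frame would be left as they are.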
I had a similar problem. Recreating the DataFrame with an explicit float dtype (e.g. dtype='float32'
) fixed it:
df = pd.DataFrame(data = df.values, columns= cols, dtype='float32')
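One thing to watch with this approach: a DataFrame rebuilt from df.values without index= gets a fresh integer index, so a DatetimeIndex would be lost. A minimal sketch of two ways to keep it follows; the names as_float and rebuilt are invented for the example, and astype is offered as an alternative rather than as what this answer originally used.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O')},
                  index=pd.to_datetime(['2016-01-21 20:06:22',
                                        '2016-01-21 20:06:23',
                                        '2016-01-21 20:06:24']))

# astype converts the values and keeps the existing index and column names.
# Unlike pd.to_numeric(..., errors='coerce'), it raises on unconvertible values.
as_float = df.astype('float32')

# The rebuild approach, with index= added so the timestamps survive.
rebuilt = pd.DataFrame(data=df.values, columns=df.columns,
                       index=df.index, dtype='float32')

print(as_float.dtypes)
print(rebuilt.index)
```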