
Pandas Interpolation: {ValueError}Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear

I am trying to interpolate time series data, df, which looks like:

         id      data        lat      notes    analysis_date
0  17358709       NaN  26.125979      None     2019-09-20 12:00:00+00:00
1  17358709       NaN  26.125979      None     2019-09-20 12:00:00+00:00
2  17352742 -2.331365  26.125979      None     2019-09-20 12:00:00+00:00
3  17358709 -4.424366  26.125979      None     2019-09-20 12:00:00+00:00

I tried df.groupby(['lat', 'lon']).apply(lambda group: group.interpolate(method='linear')), and it throws:

ValueError: Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear

I suspect the issue is that I have None values, which I do not want to interpolate. What is the solution?

df.dtypes gives me:

id                                                                int64
data                                                            float64
lat                                                             float64
notes                                                            object
analysis_date         datetime64[ns, psycopg2.tz.FixedOffsetTimezone...
dtype: object
Preethi Vaidyanathan asked Nov 13 '19 18:11

1 Answer

DataFrame.interpolate has issues with timezone-aware datetime64[ns, tz] columns, which leads to that rather cryptic error message. For example:

import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2010', '2011', 'foo', '2012', '2013'], 
                                          errors='coerce')})
df['time'] = df.time.dt.tz_localize('UTC').dt.tz_convert('Asia/Kolkata')
df.interpolate()

ValueError: Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear
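
To check whether a frame contains timezone-aware columns before interpolating, something like the following may help. It is a minimal sketch on made-up data (the column names mirror the question, the values are hypothetical), and it uses select_dtypes to separate the timezone-aware columns from the numeric ones.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question.
df = pd.DataFrame({
    "data": [1.0, None, 3.0],
    "analysis_date": pd.to_datetime(["2019-09-20 12:00"] * 3, utc=True),
})

# Columns whose dtype is a timezone-aware datetime.
tz_cols = df.select_dtypes(include="datetimetz").columns.tolist()

# Interpolate only the numeric columns, leaving the tz-aware ones untouched.
numeric = df.select_dtypes(include="number").interpolate()
```

Here tz_cols lists the columns that could trigger the error, and numeric holds the interpolated numeric columns only.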


In this case, interpolating that column is unnecessary, so interpolate only the column you need. We still want DataFrame.interpolate, so select the column with [[ ]] (Series.interpolate leads to some odd reshaping):

df['data'] = df.groupby(['lat', 'lon']).apply(lambda x: x[['data']].interpolate())
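
As a self-contained illustration of interpolating only the needed column per group, here is a sketch on hypothetical data (the lat, lon, and data values are invented for the example). It swaps in groupby(...).transform rather than apply, on the assumption that a result aligned to the original index is what is wanted; it is an alternative, not the exact call above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two (lat, lon) groups, each with one missing value.
df = pd.DataFrame({
    "lat": [26.1, 26.1, 26.1, 30.0, 30.0, 30.0],
    "lon": [-80.0, -80.0, -80.0, -90.0, -90.0, -90.0],
    "data": [1.0, np.nan, 3.0, 10.0, np.nan, 20.0],
    "notes": [None] * 6,
    "analysis_date": pd.to_datetime(["2019-09-20 12:00"] * 6, utc=True),
})

# Interpolate 'data' within each group; transform keeps the original index,
# so the result can be assigned straight back to the column.
df["data"] = (
    df.groupby(["lat", "lon"])["data"]
      .transform(lambda s: s.interpolate(method="linear"))
)
```

The notes and analysis_date columns are never passed to interpolate, so the None values and the timezone-aware timestamps stay as they were.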
ALollz answered Oct 05 '22 20:10
