Pandas Interpolation: {ValueError}Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear

Tags:

dataframe

I am trying to interpolate time series data, df, which looks like:

         id      data        lat      notes    analysis_date
0  17358709       NaN  26.125979      None     2019-09-20 12:00:00+00:00
1  17358709       NaN  26.125979      None     2019-09-20 12:00:00+00:00
2  17352742 -2.331365  26.125979      None     2019-09-20 12:00:00+00:00
3  17358709 -4.424366  26.125979      None     2019-09-20 12:00:00+00:00

I tried df.groupby(['lat', 'lon']).apply(lambda group: group.interpolate(method='linear')), which throws {ValueError}Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear. I suspect the issue is that I have None values, which I do not want to interpolate. What is the solution?

df.dtypes gives me:

id                                                                int64
data                                                            float64
lat                                                             float64
notes                                                            object
analysis_date         datetime64[ns, psycopg2.tz.FixedOffsetTimezone...
dtype: object


asked Nov 13 '19 18:11

Preethi Vaidyanathan

1 Answer

DataFrame.interpolate has issues with timezone-aware datetime64[ns, tz] columns (such as analysis_date in your frame), which leads to that rather cryptic error message. For example:

import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2010', '2011', 'foo', '2012', '2013'], 
                                          errors='coerce')})
df['time'] = df.time.dt.tz_localize('UTC').dt.tz_convert('Asia/Kolkata')
df.interpolate()

ValueError: Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear
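To check whether a frame contains such timezone-aware columns before interpolating, something like the following sketch may help. The small frame here is made up for illustration (it only borrows the column names from the question), and `tz_cols` is just a hypothetical variable name.

```python
import numpy as np
import pandas as pd

# Illustrative frame only; the values are invented, not the asker's data.
df = pd.DataFrame({
    "data": [1.0, np.nan, 3.0],
    "notes": [None, None, None],
    "analysis_date": pd.to_datetime(
        ["2019-09-20 12:00", "2019-09-20 13:00", "2019-09-20 14:00"]
    ).tz_localize("UTC"),
})

# select_dtypes accepts "datetimetz" to pick out timezone-aware datetime columns.
tz_cols = df.select_dtypes(include=["datetimetz"]).columns.tolist()
print(tz_cols)
```

Any column listed in `tz_cols` is a candidate for being left out of the interpolate call.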

In this case, interpolating that column is unnecessary, so interpolate only the column you need. We still want DataFrame.interpolate, so select the column with [[ ]] (Series.interpolate leads to some odd reshaping):

df['data'] = df.groupby(['lat', 'lon']).apply(lambda x: x[['data']].interpolate())
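As an alternative sketch, and not part of the approach above, the same per-group interpolation can be written with groupby(...)[column].transform, which returns a result aligned to the original index. This may be easier to assign back, since depending on the pandas version the result of groupby.apply can carry the group keys in its index. The frame below is invented for illustration and assumes a lon column exists, as the groupby in the question implies.

```python
import numpy as np
import pandas as pd

# Illustrative frame only; values are invented and a "lon" column is assumed.
df = pd.DataFrame({
    "lat": [26.1, 26.1, 26.1, 30.0, 30.0, 30.0],
    "lon": [80.0, 80.0, 80.0, 85.0, 85.0, 85.0],
    "data": [1.0, np.nan, 3.0, 10.0, np.nan, 20.0],
    "notes": [None] * 6,
    "analysis_date": pd.to_datetime(["2019-09-20 12:00:00"] * 6).tz_localize("UTC"),
})

# Interpolate only the numeric "data" column within each (lat, lon) group.
# transform keeps the original row index, so the assignment lines up row by row.
df["data"] = df.groupby(["lat", "lon"])["data"].transform(
    lambda s: s.interpolate(method="linear")
)
print(df[["lat", "lon", "data"]])
```

The timezone-aware analysis_date column and the notes column are never passed to interpolate here, so they are left untouched.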


answered Oct 05 '22 20:10

ALollz
