
Pandas interpolate() backwards in dataframe

Going forward, interpolate works great:

       name    days
0      a       NaN
1      a       NaN
2      a         2
3      a         3
4      a       NaN 
5      a       NaN  

records['days'] = records['days'].interpolate(method='linear')

       name    days
0      a       NaN
1      a       NaN
2      a         2
3      a         3
4      a         4 
5      a         5  

...however, it does not fill the leading rows (it only goes forward). The limit_direction parameter accepts {'forward', 'backward', 'both'}, but none of these works for me. Is there a proper way to interpolate backwards?

We can assume a series incrementing or decrementing by 1, which may not start at 0 as it happens to in this example.

Brian Bien asked Nov 14 '16 05:11


1 Answer

It seems limit_direction works only together with the limit parameter; see the docs [In 47]:

Add a limit_direction keyword argument that works with limit to enable interpolate to fill NaN values forward, backward, or both (GH9218, GH10420, GH11115)

import numpy as np
import pandas as pd

records = pd.DataFrame(
{'name': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'a', 5: 'a', 6: 'a', 7: 'a', 8: 'a', 9: 'a'},
'days': {0: 0.0, 1: np.nan, 2: np.nan, 3: np.nan, 4: 4.0, 5: 5.0, 6: np.nan, 7: np.nan, 8: np.nan, 9: 9.0}},
columns=['name','days'])

print(records)
  name  days
0    a   0.0
1    a   NaN
2    a   NaN
3    a   NaN
4    a   4.0
5    a   5.0
6    a   NaN
7    a   NaN
8    a   NaN
9    a   9.0
# by default limit_direction='forward'
records['forw'] = records['days'].interpolate(method='linear',
                                              limit=1)
records['backw'] = records['days'].interpolate(method='linear',
                                               limit_direction='backward',
                                               limit=1)
records['both'] = records['days'].interpolate(method='linear',
                                              limit_direction='both',
                                              limit=1)
print(records)
  name  days  forw  backw  both
0    a   0.0   0.0    0.0   0.0
1    a   NaN   1.0    NaN   1.0
2    a   NaN   NaN    NaN   NaN
3    a   NaN   NaN    3.0   3.0
4    a   4.0   4.0    4.0   4.0
5    a   5.0   5.0    5.0   5.0
6    a   NaN   6.0    NaN   6.0
7    a   NaN   NaN    NaN   NaN
8    a   NaN   NaN    8.0   8.0
9    a   9.0   9.0    9.0   9.0
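
Note that in the example above every gap sits between two known values. The question's data instead starts with NaN, so there is nothing to interpolate from on the left. If the assumption stated in the question holds (the series changes by exactly 1 per row), one possible workaround is to project a straight line through the first valid value and use it only where data is missing. The sketch below does that; `fill_unit_step` and its `step` argument are hypothetical names made up for illustration, not part of pandas, and the approach is only valid if the unit-step assumption is true for interior gaps as well.

```python
import numpy as np
import pandas as pd


def fill_unit_step(s, step=1.0):
    """Fill every NaN in `s` from a straight line through its first valid value.

    Hypothetical helper, not a pandas API. Assumes the values change by
    `step` per row, including across any interior gaps.
    """
    values = s.to_numpy(dtype=float)
    valid = ~np.isnan(values)
    if not valid.any():
        return s.copy()  # nothing to anchor the line on
    first_pos = int(valid.argmax())  # position of the first non-NaN value
    offsets = np.arange(len(values)) - first_pos  # rows before it are negative
    projected = values[first_pos] + step * offsets
    # Keep the observed values; use the projected line only where data is missing.
    return pd.Series(np.where(valid, values, projected), index=s.index, name=s.name)


# The 'days' column from the question
days = pd.Series([np.nan, np.nan, 2, 3, np.nan, np.nan], name="days")
print(fill_unit_step(days).tolist())  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```

For a series that decrements by 1, pass `step=-1`. The helper works by position rather than by index label, so it does not depend on the index starting at 0 or being numeric.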
jezrael answered Oct 17 '22 22:10
