
Interpolation on DataFrame in pandas

Tags: python, pandas

I have a DataFrame, say a volatility surface with time as the index and strike as the columns. How do I do two-dimensional interpolation? I can reindex, but how do I deal with the resulting NaN values? I know we can use fillna(method='pad'), but that is not even linear interpolation. Is there a way to plug in our own interpolation method?

archlight asked May 05 '12 18:05



1 Answer

You can use DataFrame.interpolate() to fill the NaN values by linear interpolation along each column.

    In : import numpy
    In : import pandas

    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])

    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695

    In : df2 = df.reindex(['a','b','c','d','e','f','g'])

    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695

    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695
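Note that the default method treats the rows as equally spaced and ignores the index labels, which is why 'b' and 'f' end up as plain midpoints. If the index is numeric (time to expiry, say), interpolate(method='index') should weight by the actual index values instead. A minimal sketch, assuming a tiny made-up surface (the times, strikes and vols below are invented for illustration):

```python
import pandas as pd

# Hypothetical data: rows are times in years, columns are strikes.
surface = pd.DataFrame(
    {90.0: [0.30, 0.26], 100.0: [0.25, 0.21]},
    index=[0.0, 1.0],
)

# Insert an unevenly spaced time; its row starts out as NaN.
grid = surface.reindex([0.0, 0.25, 1.0])

# Default 'linear': rows are treated as equally spaced, so the new row
# becomes the midpoint of its neighbours.
by_position = grid.interpolate()

# 'index': the numeric index values are used as the x-coordinates.
by_index = grid.interpolate(method="index")

print(by_position)
print(by_index)
```

With this data the two results differ at t = 0.25, because that point sits a quarter of the way between the known rows rather than halfway.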

For anything more complex, you need to roll your own function that takes a Series object, fills the NaN values as you like, and returns another Series object.
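As a rough sketch of what such a function could look like: the helper below (fill_with_neighbour_mean is a made-up name, and the fill rule is just an example, not a recommendation) replaces each NaN with the mean of the nearest known values on either side, and is applied column by column with DataFrame.apply.

```python
import numpy as np
import pandas as pd


def fill_with_neighbour_mean(col: pd.Series) -> pd.Series:
    """Hypothetical custom filler: replace each NaN with the mean of the
    nearest known value before it and the nearest known value after it."""
    before = col.ffill()  # nearest known value looking backwards
    after = col.bfill()   # nearest known value looking forwards
    # At the edges only one side exists, so fall back to that side.
    before = before.fillna(after)
    after = after.fillna(before)
    return col.fillna((before + after) / 2)


df = pd.DataFrame({
    "x": [1.0, np.nan, np.nan, 4.0],
    "y": [np.nan, 2.0, np.nan, 6.0],
})

# apply() passes each column to the function as a Series.
filled = df.apply(fill_with_neighbour_mean)
print(filled)
```

Any function with the same shape (Series in, Series out) can be swapped in the same way; pass axis=1 to apply() to work along rows instead.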

Avaris answered Sep 19 '22 10:09