I have a DataFrame, say a volatility surface with time as the index and strike as the columns. How do I do two-dimensional interpolation? I can reindex, but how do I deal with the NaN values? I know we can use fillna(method='pad'), but that is not even linear interpolation. Is there a way to plug in our own interpolation method?
You can interpolate missing values (NaN) in a pandas DataFrame or Series with interpolate(). Use dropna() to remove missing values, or fillna() to fill them with a specific value.
Interpolation is a technique for estimating unknown data points that lie between known data points. In pandas it is mostly used to impute missing values in a DataFrame or Series while preprocessing data.
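As a small illustration of that idea, here is a minimal sketch with made-up numbers: two known values at the ends of a Series and three missing values between them. The default method of interpolate() is linear and treats the rows as equally spaced.

```python
import numpy as np
import pandas as pd

# Illustrative data: known values at both ends, three missing in between.
s = pd.Series([10.0, np.nan, np.nan, np.nan, 20.0], index=[0, 1, 2, 3, 4])

# Default method="linear" fills the gap with evenly spaced values.
filled = s.interpolate()
print(filled.tolist())  # [10.0, 12.5, 15.0, 17.5, 20.0]
```

Because the default ignores the index labels, unevenly spaced data usually needs method="index" (or a custom function) instead.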
You can use DataFrame.interpolate to get a linear interpolation.
    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])

    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695

    In : df2 = df.reindex(['a','b','c','d','e','f','g'])

    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695

    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695

For anything more complex, you need to roll out your own function that takes a Series object, fills the NaN values as you like, and returns another Series object.
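One way such a roll-your-own function could look is sketched below. The name interp_by_index and the surface values are hypothetical, chosen only for illustration. The function fills NaN in a Series linearly against its numeric index using numpy.interp, and is applied first down the time axis and then across the strike axis. This is a sketch under those assumptions, not the only way to do it; the built-in interpolate(method='index') is a ready-made alternative for the one-dimensional step.

```python
import numpy as np
import pandas as pd


def interp_by_index(s: pd.Series) -> pd.Series:
    """Hypothetical helper: fill NaN linearly against a sorted numeric index."""
    x = s.index.to_numpy(dtype=float)
    y = s.to_numpy(dtype=float)
    known = ~np.isnan(y)
    if known.sum() < 2:
        return s  # not enough known points to interpolate; leave unchanged
    filled = y.copy()
    # np.interp needs increasing x; outside the known range it repeats the
    # nearest edge value rather than extrapolating.
    filled[~known] = np.interp(x[~known], x[known], y[known])
    return pd.Series(filled, index=s.index, name=s.name)


# Made-up surface: rows are times to expiry (years), columns are strikes.
surface = pd.DataFrame(
    [[0.30, 0.25, 0.22],
     [0.28, 0.24, 0.21],
     [0.26, 0.23, 0.20]],
    index=[0.25, 0.5, 1.0],
    columns=[90.0, 100.0, 110.0],
)

# Reindex onto a finer grid; the new row and column are all NaN.
grid = surface.reindex(index=[0.25, 0.5, 0.75, 1.0],
                       columns=[90.0, 95.0, 100.0, 110.0])

# Interpolate along time (each column), then along strike (each row).
by_time = grid.apply(interp_by_index, axis=0)
full = by_time.apply(interp_by_index, axis=1)
print(full)
```

Any function with the same Series-in, Series-out shape can be swapped in for interp_by_index, for example one that interpolates in total variance rather than in volatility.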