I have a DataFrame, say a volatility surface with index as time and column as strike. How do I do two-dimensional interpolation? I can reindex, but how do I deal with NaN? I know we can use fillna(method='pad'), but that is not even linear interpolation. Is there a way we can plug in our own method to do interpolation?
You can interpolate missing values (NaN) in a pandas DataFrame or Series with interpolate(). Use dropna() and fillna() to remove missing values or to fill them with a specific value.
Interpolation is a technique for estimating unknown data points between known data points. In pandas it is mostly used to impute missing values in a DataFrame or Series while preprocessing data.
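You can use DataFrame.interpolate to get a linear interpolation. With the default method='linear', the rows are treated as equally spaced and the index labels are ignored.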
    In : import numpy, pandas

    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])

    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695

    In : df2 = df.reindex(['a','b','c','d','e','f','g'])

    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695

    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695
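For a surface like the one in the question, where both the index (time) and the columns (strike) are numeric, one possible approach is to interpolate along each axis in turn with method='index', so that the spacing of the labels is respected. The following is only a sketch under that assumption: the surface values and the names surface, grid, by_time and filled are invented for illustration, and the result is a sequence of two one-dimensional linear passes rather than a dedicated two-dimensional scheme.

```python
import pandas as pd

# Hypothetical volatility surface: rows are times (in years), columns are strikes.
surface = pd.DataFrame(
    [[0.20, 0.18, 0.22],
     [0.24, 0.21, 0.26]],
    index=[0.25, 1.0],
    columns=[90.0, 100.0, 110.0],
)

# Reindex to add one new time and one new strike; the new cells are NaN.
grid = surface.reindex(index=[0.25, 0.5, 1.0],
                       columns=[90.0, 95.0, 100.0, 110.0])

# Pass 1: interpolate down each column, using the time labels as x-coordinates.
by_time = grid.interpolate(method="index")

# Pass 2: interpolate across strikes. Transposing makes the strikes the index,
# so method="index" uses the strike labels as x-coordinates.
filled = by_time.T.interpolate(method="index").T

print(filled)
```

The new strike column (95.0) is entirely NaN after the first pass and only gets filled by the second one, which is why both passes are needed here.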
For anything more complex, you need to roll out your own function that takes a Series object, fills the NaN values as you like, and returns another Series object.
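As a rough illustration of such a function, the sketch below fills NaN values by interpolating linearly in log space and is applied column by column with DataFrame.apply. The function name fill_log_linear and the sample data are hypothetical, and the sketch assumes a sorted numeric index and strictly positive values; numpy.interp also holds the end values flat outside the known range, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

def fill_log_linear(s: pd.Series) -> pd.Series:
    """Fill NaN by linear interpolation of log(values) against the index.

    Assumes a sorted numeric index and strictly positive known values.
    """
    known = s.dropna()
    if known.empty:
        return s  # nothing to interpolate from
    x_all = s.index.to_numpy(dtype=float)
    x_known = known.index.to_numpy(dtype=float)
    # np.interp is piecewise linear; outside the known range it repeats the end values.
    estimate = np.exp(np.interp(x_all, x_known, np.log(known.to_numpy())))
    # Keep the original values where present; use the estimate only where NaN.
    return s.where(s.notna(), pd.Series(estimate, index=s.index))

df = pd.DataFrame(
    {"a": [1.0, np.nan, 4.0], "b": [2.0, np.nan, 8.0]},
    index=[0.0, 1.0, 2.0],
)

# DataFrame.apply passes each column to the function as a Series.
result = df.apply(fill_log_linear)
print(result)
```

Passing axis=1 to apply would run the same function across each row instead, using the column labels as x-coordinates.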