
How can I do an interpolating reindex in pandas using datetime indices?

I have a series with a datetime index, and I'd like to interpolate its data onto some other, arbitrary datetime index. Essentially, I want to make the following code snippet more or less work:

from pandas import Series
import datetime

datetime_index = [datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 10)]
data_series = Series([5, 15], [datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 15)])

def interpolating_reindex(data_series, datetime_index):
    """?????"""

goal_series = interpolating_reindex(data_series, datetime_index) 

# Comparing two Series is element-wise, so reduce with .all() before asserting.
assert (goal_series == Series([5, 10], datetime_index)).all()

reindex doesn't do what I want because it can't interpolate, and also my series might not have the same indices anyway. resample isn't what I want because I want to use an arbitrary, already defined index which isn't necessarily periodic. I've also tried combining indices using Index.join in the hopes that I could then do reindex and then interpolate, but that didn't work as I expected. Any pointers?
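To make the reindex limitation concrete, here is a minimal sketch using the same data as above (the `pd` alias and the name `reindexed` are only illustrative). A plain reindex looks up existing labels and nothing more, so the missing date comes back as NaN:

```python
import datetime
import pandas as pd

data_series = pd.Series(
    [5, 15],
    index=[datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 15)],
)
datetime_index = [datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 10)]

# Jan 5 exists in data_series, so its value is carried over.
# Jan 10 does not exist, so reindex fills it with NaN rather than interpolating.
reindexed = data_series.reindex(datetime_index)
print(reindexed)
```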

Kevin S asked May 21 '14 01:05




1 Answer

Try this:

from pandas import Series
import datetime

datetime_index = [datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 10)]
s1 = Series([5, 15], [datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 15)])
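# All-NaN placeholder on the target index.
s2 = Series(None, datetime_index, dtype=float)
# Union of both indices: values from s1 where present, NaN elsewhere.
s3 = s1.combine_first(s2)
# method='time' weights by elapsed time; the default 'linear' ignores the index.
s3.interpolate(method='time')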

Based on the comments, the result interpolated to the target index would be:

goal_series = s3.interpolate(method='time').reindex(datetime_index)

assert((goal_series == Series([5, 10], datetime_index)).all())
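
For reference, the same idea can be wrapped into the `interpolating_reindex` function from the question. This is only a sketch under a few assumptions: the series has a unique, datetime-like index, and the local names `target`, `combined` and `filled` are made up for illustration. It builds the union with `Index.union` instead of `combine_first`, and uses `method="time"` so that unevenly spaced timestamps are weighted by elapsed time:

```python
import datetime
import pandas as pd


def interpolating_reindex(data_series, datetime_index):
    """Interpolate data_series linearly in time onto datetime_index."""
    target = pd.DatetimeIndex(datetime_index)
    # Sorted union of the source timestamps and the requested timestamps.
    combined = data_series.index.union(target)
    # Reindexing onto the union inserts NaN at the new timestamps;
    # method="time" then fills them in proportion to elapsed time.
    filled = data_series.reindex(combined).interpolate(method="time")
    # Keep only the requested timestamps.
    return filled.reindex(target)


data_series = pd.Series(
    [5, 15],
    index=[datetime.datetime(2010, 1, 5), datetime.datetime(2010, 1, 15)],
)
# Jan 7 is added to the question's target dates to show uneven spacing.
datetime_index = [
    datetime.datetime(2010, 1, 5),
    datetime.datetime(2010, 1, 7),
    datetime.datetime(2010, 1, 10),
]

goal_series = interpolating_reindex(data_series, datetime_index)
print(goal_series)  # values close to 5, 7 and 10
```

Target timestamps that fall outside the range of the source data are not extrapolated linearly, so check those separately if they can occur.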
HYRY answered Nov 14 '22 23:11