
Interpolating one time series onto another in pandas

I have one set of values measured at regular times. Say:

import pandas as pd
import numpy as np
rng = pd.date_range('2013-01-01', periods=12, freq='H')
data = pd.Series(np.random.randn(len(rng)), index=rng)

And another set of more arbitrary times, for example (in reality these times are not a regular sequence):

ts_rng = pd.date_range('2013-01-01 01:11:21', periods=7, freq='87Min')
ts = pd.Series(index=ts_rng, dtype=float)  # explicit float dtype so the empty values are NaN

I want to know the value of data interpolated at the times in ts.
I can do this in numpy:

# np.interp needs numeric x values, so convert the timestamps to integer nanoseconds
x = ts_rng.to_numpy().astype(np.int64)
xp = data.index.to_numpy().astype(np.int64)
fp = data.to_numpy()
ts[:] = np.interp(x, xp, fp)

I feel pandas has this functionality somewhere in resample, reindex, etc., but I can't quite find it.

elfnor asked Sep 23 '13 08:09


2 Answers

You can concatenate the two time series and sort by index. Since the values in the second series are NaN, you can interpolate and then just select the values at the points from the second series. Pass method='time' so the interpolation is weighted by the actual timestamps; the default method='linear' treats the points as equally spaced:

pd.concat([data, ts]).sort_index().interpolate(method='time').reindex(ts.index)

or

pd.concat([data, ts]).sort_index().interpolate(method='time').loc[ts.index]
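
A minimal self-contained check of the concatenate-and-interpolate approach against np.interp, offered as a sketch rather than a definitive implementation. The squared sample values and the names by_time, by_position and reference are assumptions made for illustration, and pd.Timedelta objects are passed as freq only to sidestep frequency-alias differences between pandas versions. It compares method='time' with the default method='linear':

```python
import numpy as np
import pandas as pd

# Regular hourly series with deterministic, non-linear values (assumed sample data).
rng = pd.date_range("2013-01-01", periods=12, freq=pd.Timedelta(hours=1))
data = pd.Series(np.arange(12.0) ** 2, index=rng)

# Irregularly placed target timestamps; float dtype so the placeholders are NaN.
ts_rng = pd.date_range("2013-01-01 01:11:21", periods=7, freq=pd.Timedelta(minutes=87))
ts = pd.Series(np.nan, index=ts_rng)

combined = pd.concat([data, ts]).sort_index()

# Weighted by the real time gaps between neighbouring points.
by_time = combined.interpolate(method="time").reindex(ts.index)
# Default method='linear': ignores the index and treats points as equally spaced.
by_position = combined.interpolate().reindex(ts.index)

# Reference result from np.interp on integer-nanosecond timestamps.
reference = np.interp(
    ts_rng.to_numpy().astype(np.int64),
    rng.to_numpy().astype(np.int64),
    data.to_numpy(),
)

matches_time = np.allclose(by_time.to_numpy(), reference)
matches_position = np.allclose(by_position.to_numpy(), reference)
```

With this sample data, only the method='time' result agrees with np.interp. The reindex step assumes the two indexes share no timestamps; shared timestamps would produce duplicate labels after the concatenation.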

Viktor Kerkez answered Oct 27 '22 01:10


Assume you would like to evaluate a time series ts on a different datetime_index. This index and the index of ts may overlap. I recommend using the following groupby trick, which essentially gets rid of duplicate timestamps. I then forward-fill, but feel free to apply fancier methods:

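def interpolate(ts, datetime_index):
    # Append NaN placeholders at the target timestamps (explicit float dtype).
    x = pd.concat([ts, pd.Series(index=datetime_index, dtype=float)])
    # Collapse duplicate timestamps (first non-null value wins), then forward-fill.
    return x.groupby(x.index).first().sort_index().ffill().loc[datetime_index]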
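
If time-weighted linear values are wanted instead of a forward fill, while still tolerating overlapping timestamps, one possible variant reindexes on the union of both indexes. This is a sketch under stated assumptions: interpolate_at is a hypothetical name, both indexes are assumed to be DatetimeIndex objects, and the input series is assumed to have no duplicate timestamps of its own.

```python
import numpy as np
import pandas as pd


def interpolate_at(series, target_index):
    """Time-weighted linear interpolation of `series` at `target_index`.

    Hypothetical helper: assumes DatetimeIndex inputs and a `series`
    index without duplicate timestamps.
    """
    # The union is sorted and de-duplicated, so shared timestamps are safe.
    full_index = series.index.union(target_index)
    return series.reindex(full_index).interpolate(method="time").reindex(target_index)


# Small worked example with values that are easy to verify by hand.
hours = pd.date_range("2013-01-01", periods=4, freq=pd.Timedelta(hours=1))
series = pd.Series([0.0, 10.0, 20.0, 30.0], index=hours)
targets = pd.DatetimeIndex(
    ["2013-01-01 00:30", "2013-01-01 01:00", "2013-01-01 02:15"]
)

result = interpolate_at(series, targets)
# result values: 5.0, 10.0, 22.5
```

The second target coincides with an existing timestamp and simply returns the stored value, which is the overlap case the groupby trick guards against.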

tschm answered Oct 26 '22 23:10