
Resample a time series with the index of another time series

I have two data frames with identical columns but different datetime indices. I want to resample one of them onto the index of the other, forward filling its data for any dates in the other's index that it has no data for.

import pandas as pd
import numpy as np
from datetime import datetime as dt

a_values = np.random.randn(4, 4)
a_index = [dt(2012, 3, 16), dt(2012, 3, 19), dt(2012, 3, 20), dt(2012, 3, 21)]
a = pd.DataFrame(data=a_values, index=a_index)

b_values = np.trunc(np.random.randn(3, 4) * 1000)
b_index = [dt(2012, 3, 16), dt(2012, 3, 19), dt(2012, 3, 21)]
b = pd.DataFrame(data=b_values, index=b_index)

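# Build the desired result by hand: add the missing 2012-03-20 row, then copy 2012-03-19 into it.
missing, previous = pd.Timestamp('2012-03-20'), pd.Timestamp('2012-03-19')
c = pd.concat([b, a.loc[[missing]]]).sort_index()
c.loc[missing] = c.loc[previous]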

'a' is the data frame whose index I'd like to use as the resampling reference. 'b' is the data frame I'd like to resample and forward fill. 'c' is what I'd like the result to look like.

Notice that 'b' is missing the '2012-03-20' index that exists in 'a'. 'c' populates the columns for index '2012-03-20' with the data in the columns from 'b' for index '2012-03-19'.

Does pandas have the functionality to do this?

Thanks in advance.

PiR

piRSquared asked Jun 06 '13 17:06




1 Answer

To resample by a reference index, use reindex.

In [11]: b.reindex(a.index, method='ffill')
Out[11]: 
               0     1     2     3
2012-03-16  -926  -625   736   457
2012-03-19 -1024   742   732 -1020
2012-03-20 -1024   742   732 -1020
2012-03-21  1090 -1163  1652   -94
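
For anyone who wants to reproduce this without random data, here is a minimal self-contained sketch of the same reindex call. The column names 'x' and 'y' and the values are made up for illustration; only the dates come from the question.

```python
import pandas as pd

# Reference index (the index of 'a' in the question); only the index matters here.
a_index = pd.to_datetime(["2012-03-16", "2012-03-19", "2012-03-20", "2012-03-21"])

# Made-up values for 'b', which has no row for 2012-03-20.
b = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]},
    index=pd.to_datetime(["2012-03-16", "2012-03-19", "2012-03-21"]),
)

# Conform b to the reference index. Dates that are missing from b take the
# most recent earlier row (forward fill). b's index must be sorted for this.
c = b.reindex(a_index, method="ffill")
print(c)
```

Note that method='ffill' only fills rows for labels that are absent from b's index; NaN values already present in b are left alone. If those should be filled too, chaining .ffill() after a plain reindex is one option.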
Dan Allan answered Oct 29 '22 22:10