
Synchronizing and Resampling two timeseries with non-uniform millisecond intraday data

I see in the Python documentation the ability to resample and synchronize two timeseries. My problem is harder because the timestamps are irregular: I read three timeseries that have non-deterministic intraday timestamps. However, in order to do most analysis (covariances, correlations, etc.) on those timeseries, I need them to be of the same length.

In Matlab, given three time series ts1, ts2, ts3 with non-deterministic intraday timestamps, I can synchronize them pairwise with:

[ts1, ts2] = synchronize(ts1, ts2, 'union');
[ts1, ts3] = synchronize(ts1, ts3, 'union');
[ts2, ts3] = synchronize(ts2, ts3, 'union');

Note that the time series are already read into pandas DataFrames, so I need to be able to synchronize (and resample?) DataFrames that already exist.

Ivan asked Nov 23 '15 21:11


1 Answer

According to the Matlab documentation that you've linked to, it sounds like you want to

Resample timeseries objects using a time vector that is a union of the time vectors of ts1 and ts2 on the time range where the two time vectors overlap.

So first you need to find the union of your dataframes' indices:

newindex = df1.index.union(df2.index)

Then you can recreate your dataframes using this index:

df1 = df1.reindex(newindex)
df2 = df2.reindex(newindex)
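As a minimal sketch of these two steps, assuming two small DataFrames with irregular millisecond timestamps (the sample data and the `price` column are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: irregular millisecond timestamps, different lengths.
df1 = pd.DataFrame(
    {"price": [10.0, 10.5, 11.0]},
    index=pd.to_datetime([
        "2015-11-23 09:30:00.105",
        "2015-11-23 09:30:00.480",
        "2015-11-23 09:30:01.020",
    ]),
)
df2 = pd.DataFrame(
    {"price": [20.0, 19.5]},
    index=pd.to_datetime([
        "2015-11-23 09:30:00.250",
        "2015-11-23 09:30:00.480",
    ]),
)

# Sorted union of both indices; the shared timestamp appears only once.
newindex = df1.index.union(df2.index)

# Both frames now share the same index, with NaN where a frame had no observation.
df1_sync = df1.reindex(newindex)
df2_sync = df2.reindex(newindex)

print(len(df1_sync), len(df2_sync))
print(df1_sync["price"].isna().sum(), df2_sync["price"].isna().sum())
```

One caveat: reindex raises an error if a DataFrame's original index contains duplicate timestamps, so duplicates would have to be dropped or aggregated first.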

Note that both DataFrames will have NaNs in all of their new entries (presumably this is the same behaviour as in Matlab). It's up to you whether to fill these: for example, ffill() (fillna(method='pad') in older pandas versions, where the method argument is not yet deprecated) fills null values with the last known value, or you could use interpolate(method='time') for linear interpolation based on the timestamps.

maxymoo answered Nov 20 '22 12:11