Pandas timeseries comparison

Question

I want to get the average difference between the overlapping parts of two time series. However, both their ranges and their intervals are different. What is the best way to solve these two problems?

Example data:

Series 1:
2014-08-05 05:03:00    25.194      
2014-08-05 05:08:00    25.196      
2014-08-05 05:13:00    25.197      
2014-08-05 05:18:00    25.199      
2014-08-05 05:23:00    25.192      

Series 2:
2014-08-05 05:12:00    25.000000
2014-08-05 05:13:00    25.000000
2014-08-05 05:14:00    25.000000

joris · Accepted Answer

Is this what you are looking for?

First you can align both serieses (so they both have the same indices. It is also possible to only reindex one of both to the index of the other with reindex):

In [85]: s1, s2 = s1.align(s2)

In [86]: s1
Out[86]: 
2014-08-05 05:03:00    25.194
2014-08-05 05:08:00    25.196
2014-08-05 05:12:00       NaN
2014-08-05 05:13:00    25.197
2014-08-05 05:14:00       NaN
2014-08-05 05:18:00    25.199
2014-08-05 05:23:00    25.192
dtype: float64

In [87]: s2
Out[87]: 
2014-08-05 05:03:00   NaN
2014-08-05 05:08:00   NaN
2014-08-05 05:12:00    25
2014-08-05 05:13:00    25
2014-08-05 05:14:00    25
2014-08-05 05:18:00   NaN
2014-08-05 05:23:00   NaN
dtype: float64

Then you can interpolate the missing values (eg with linear interpolation based on the time index):

In [88]: s1.interpolate(method='time')
Out[88]: 
2014-08-05 05:03:00    25.1940
2014-08-05 05:08:00    25.1960
2014-08-05 05:12:00    25.1968
2014-08-05 05:13:00    25.1970
2014-08-05 05:14:00    25.1974
2014-08-05 05:18:00    25.1990
2014-08-05 05:23:00    25.1920
dtype: float64

And then just substract both serieses to get the difference:

In [91]: s = s1.interpolate(method='time') - s2.interpolate(method='time')

In [92]: s
Out[92]: 
2014-08-05 05:03:00       NaN
2014-08-05 05:08:00       NaN
2014-08-05 05:12:00    0.1968
2014-08-05 05:13:00    0.1970
2014-08-05 05:14:00    0.1974
2014-08-05 05:18:00    0.1990
2014-08-05 05:23:00    0.1920
dtype: float64

In [93]: s.mean()
Out[93]: 0.19643999999999906

Pandas timeseries comparison

Tags:

pandas

time-series

resampling

mikael

1 Answers

joris

Recent Activity

Donate For Us

Pandas timeseries comparison

Tags:

pandas

time-series

resampling

mikael

1 Answers

joris

Related questions

Recent Activity

Donate For Us