
Pandas Rolling On DateTime Multi Index Frame




There are similar questions, but my datetime values are sparse and irregular, i.e. they are arbitrary timestamps rather than a regular series. What I need is to use rolling() over the 2nd index level (time) while keeping the groups defined by the 1st index level (id) separate.

There is a very similar GitHub issue that you might also want to contribute to: https://github.com/pandas-dev/pandas/issues/15584

Code to reproduce:

import pandas as pd

data = {
    'id': ['A', 'A', 'A', 'B'],
    'time': pd.to_datetime(['2018-01-04 08:13:51.181', '2018-01-04 08:13:55.181', '2018-01-04 09:13:51.181', '2018-01-04 08:13:51.183']),
    'colA': [4, 3, 2, 1],
    # Expected results, included for comparison
    '30min_rolling_output': [4, 7, 2, 1],
    '1day_rolling_output': [4, 7, 9, 1],
}
test_df = pd.DataFrame(data=data).set_index(['id', 'time'])

The desired output, assuming rolling windows of 30 minutes and 1 day:


                            colA  30min_rolling_output  1day_rolling_output
id time                                                          
A  2018-01-04 08:13:51.181     4                     4                    4
   2018-01-04 08:13:55.181     3                     7                    7
   2018-01-04 09:13:51.181     2                     2                    9
B  2018-01-04 08:13:51.183     1                     1                    1
user10430178 asked Sep 28 '18 15:09


1 Answer

Move id out of the index with reset_index(level=0). That leaves a DatetimeIndex, so you can group by id and roll over a time-based window:

# The grouped result is indexed by (id, time), so it aligns with test_df on assignment.
test_df['30min'] = test_df.reset_index(level=0).groupby('id').colA.rolling('30min').sum()
test_df['1day'] = test_df.reset_index(level=0).groupby('id').colA.rolling('1D').sum()


                            colA  30min  1day
id time                                      
A  2018-01-04 08:13:51.181     4    4.0   4.0
   2018-01-04 08:13:55.181     3    7.0   7.0
   2018-01-04 09:13:51.181     2    2.0   9.0
B  2018-01-04 08:13:51.183     1    1.0   1.0
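
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper name rolling_sum_per_id is hypothetical (it is not part of pandas), and the sort_index() call is an added assumption: time-based windows need the timestamps to be increasing within each group, so sorting first guards against unordered input.

```python
import pandas as pd

# Same sample data as in the question, indexed by (id, time).
df = pd.DataFrame({
    'id': ['A', 'A', 'A', 'B'],
    'time': pd.to_datetime(['2018-01-04 08:13:51.181',
                            '2018-01-04 08:13:55.181',
                            '2018-01-04 09:13:51.181',
                            '2018-01-04 08:13:51.183']),
    'colA': [4, 3, 2, 1],
}).set_index(['id', 'time'])

# Assumption: the input may be unordered, so sort by id and then time.
df = df.sort_index()


def rolling_sum_per_id(frame, window):
    """Hypothetical helper: time-window rolling sum of colA within each id."""
    return (frame.reset_index(level='id')   # leaves a DatetimeIndex
                 .groupby('id')['colA']
                 .rolling(window)           # offset string, e.g. '30min'
                 .sum())                    # result is indexed by (id, time)


df['30min'] = rolling_sum_per_id(df, '30min')
df['1day'] = rolling_sum_per_id(df, '1D')
print(df)
```

On the sample data this should give the same 30min and 1day columns as the table above.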
ALollz answered Oct 18 '22 01:10
