There are similar questions, but my datetime objects are irregularly spaced and not ordered, i.e. they are random timestamps. Basically, what I need is to use rolling(), but roll over the 2nd index level while respecting the group (1st index level).
There is a very similar GitHub issue that you might also want to contribute to: https://github.com/pandas-dev/pandas/issues/15584
Code to reproduce:
import pandas as pd
data = {
'id': ['A','A','A','B'],
'time': pd.to_datetime(['2018-01-04 08:13:51.181','2018-01-04 08:13:55.181','2018-01-04 09:13:51.181', '2018-01-04 08:13:51.183']),
'colA': [4,3,2,1],
'30min_rolling_output': [4,7,2,1],
'1day_rolling_output': [4,7,9,1]
}
test_df = pd.DataFrame(data=data).set_index(['id', 'time'])
The desired output assumes window arguments of 30min and 1d.
Visualising:
                            colA  30min_rolling_output  1day_rolling_output
id time
A  2018-01-04 08:13:51.181     4                     4                    4
   2018-01-04 08:13:55.181     3                     7                    7
   2018-01-04 09:13:51.181     2                     2                    9
B  2018-01-04 08:13:51.183     1                     1                    1
Remove id from the index with reset_index(level=0), leaving you with a DatetimeIndex that you can then group by id and roll over.
test_df['30min'] = test_df.reset_index(level=0).groupby('id').colA.rolling('30min').sum()
test_df['1day'] = test_df.reset_index(level=0).groupby('id').colA.rolling('1d').sum()
                            colA  30min  1day
id time
A  2018-01-04 08:13:51.181     4    4.0   4.0
   2018-01-04 08:13:55.181     3    7.0   7.0
   2018-01-04 09:13:51.181     2    2.0   9.0
B  2018-01-04 08:13:51.183     1    1.0   1.0
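The question mentions that the timestamps are not ordered, while time-based windows expect the time index to be monotonic within each group. A minimal sketch of one way to handle that, assuming the same rows as above arrive shuffled (the shuffled sample and the names df and rolled_* are made up for illustration), is to sort the MultiIndex before rolling:

```python
import pandas as pd

# Hypothetical sample: the same four rows as in the question, but shuffled
df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'A'],
    'time': pd.to_datetime([
        '2018-01-04 09:13:51.181',
        '2018-01-04 08:13:51.183',
        '2018-01-04 08:13:51.181',
        '2018-01-04 08:13:55.181',
    ]),
    'colA': [2, 1, 4, 3],
}).set_index(['id', 'time'])

# Sort by id, then time, so each group's timestamps are increasing
df = df.sort_index()

# Move id out of the index, group by it, and roll over the DatetimeIndex
grouped = df.reset_index(level='id').groupby('id')['colA']
rolled_30min = grouped.rolling('30min').sum()
rolled_1day = grouped.rolling('1d').sum()

# The results are indexed by (id, time), so assignment aligns on the index
df['30min'] = rolled_30min
df['1day'] = rolled_1day
print(df)
```

After sorting, the rows are in the same order as in the table above, so the rolled sums should match it. Note that the assignment relies on the (id, time) pairs being unique; with duplicate index entries the alignment step would need a different approach.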