
How to use days as window for pandas rolling_apply function

I have a pandas DataFrame with irregularly spaced dates. Is there a way to use 7 days as a moving window to calculate statistics such as the median and the median absolute deviation? I feel like I could somehow use pandas.rolling_apply, but its window parameter takes a fixed number of rows, not a time span over irregularly spaced dates. I found a similar post, https://stackoverflow.com/a/30244019/3128336, and am trying to write my own custom function, but I still cannot figure it out. Can anyone please help?

import pandas as pd
from datetime import datetime

person = ['A','B','C','B','A','C','A','B','C','A',]
ts = [
    datetime(2000, 1, 1),
    datetime(2000, 1, 1),
    datetime(2000, 1, 10),
    datetime(2000, 1, 20),
    datetime(2000, 1, 25),
    datetime(2000, 1, 30),
    datetime(2000, 2, 8),
    datetime(2000, 2, 12),
    datetime(2000, 2, 17),
    datetime(2000, 2, 20),
]
score = [9,2,1,3,8,4,2,3,1,9]
df = pd.DataFrame({'ts': ts, 'person': person, 'score': score})

df looks like this:

    person  score   ts
0   A       9       2000-01-01
1   B       2       2000-01-01
2   C       1       2000-01-10
3   B       3       2000-01-20
4   A       8       2000-01-25
5   C       4       2000-01-30
6   A       2       2000-02-08
7   B       3       2000-02-12
8   C       1       2000-02-17
9   A       9       2000-02-20
E.K. asked Feb 06 '16 15:02

1 Answer

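You can use a timedelta to select the rows that fall within your window, then use apply to run through each row and aggregate. Note that the mask below has only an upper bound, so each row aggregates every row dated up to 7 days after it; add a lower bound on df['ts'] if you want a strict 7-day window:

>>> import numpy as np
>>> from datetime import timedelta
>>> delta = timedelta(days=7)
>>> df_score_mean = df.apply(lambda x: np.mean(df['score'][df['ts'] <= x['ts'] + delta]), axis=1)
>>> df_score_mean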
0    5.500000
1    5.500000
2    4.000000
3    4.600000
4    4.500000
5    4.500000
6    4.555556
7    4.200000
8    4.200000
9    4.200000
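
If you want a strict trailing 7-day window with the median and median absolute deviation from the question, the same masking idea can be bounded on both sides. This is only a sketch under some assumptions: the helper names `mad` and `trailing_window` and the output columns `median_7d` and `mad_7d` are made up for illustration, the window is taken as (ts - 7 days, ts], and the statistics are computed over all persons together rather than per person.

```python
import pandas as pd
from datetime import datetime

# Same sample data as in the question.
df = pd.DataFrame({
    'ts': [
        datetime(2000, 1, 1), datetime(2000, 1, 1), datetime(2000, 1, 10),
        datetime(2000, 1, 20), datetime(2000, 1, 25), datetime(2000, 1, 30),
        datetime(2000, 2, 8), datetime(2000, 2, 12), datetime(2000, 2, 17),
        datetime(2000, 2, 20),
    ],
    'person': ['A', 'B', 'C', 'B', 'A', 'C', 'A', 'B', 'C', 'A'],
    'score': [9, 2, 1, 3, 8, 4, 2, 3, 1, 9],
})


def mad(values):
    """Median absolute deviation of a Series around its own median."""
    return (values - values.median()).abs().median()


def trailing_window(frame, row_ts, days=7):
    """Scores whose timestamp lies in (row_ts - days, row_ts] (assumed convention)."""
    delta = pd.Timedelta(days=days)
    mask = (frame['ts'] > row_ts - delta) & (frame['ts'] <= row_ts)
    return frame.loc[mask, 'score']


# Evaluate both statistics once per row, using that row's timestamp as the window end.
df['median_7d'] = df['ts'].apply(lambda t: trailing_window(df, t).median())
df['mad_7d'] = df['ts'].apply(lambda t: mad(trailing_window(df, t)))

print(df[['ts', 'score', 'median_7d', 'mad_7d']])
```

This scans the whole frame for every row, so it is quadratic in the number of rows; that is fine for small data like the example. Recent pandas versions also accept an offset string as the window, e.g. df.rolling('7D', on='ts'), which needs ts to be sorted and may be worth trying on larger frames.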
Brian Huey answered Nov 09 '22 18:11