Suppose I have a time series:
In[138] rng = pd.date_range('1/10/2011', periods=10, freq='D') In[139] ts = pd.Series(randn(len(rng)), index=rng) In[140] Out[140]: 2011-01-10 0 2011-01-11 1 2011-01-12 2 2011-01-13 3 2011-01-14 4 2011-01-15 5 2011-01-16 6 2011-01-17 7 2011-01-18 8 2011-01-19 9 Freq: D, dtype: int64
If I use one of the rolling_* functions, for instance rolling_sum, I can get the behavior I want for backward looking rolling calculations:
In [157]: pd.rolling_sum(ts, window=3, min_periods=0) Out[157]: 2011-01-10 0 2011-01-11 1 2011-01-12 3 2011-01-13 6 2011-01-14 9 2011-01-15 12 2011-01-16 15 2011-01-17 18 2011-01-18 21 2011-01-19 24 Freq: D, dtype: float64
But what if I want to do a forward-looking sum? I've tried something like this:
In [161]: pd.rolling_sum(ts.shift(-2, freq='D'), window=3, min_periods=0) Out[161]: 2011-01-08 0 2011-01-09 1 2011-01-10 3 2011-01-11 6 2011-01-12 9 2011-01-13 12 2011-01-14 15 2011-01-15 18 2011-01-16 21 2011-01-17 24 Freq: D, dtype: float64
But that's not exactly the behavior I want. What I am looking for as an output is:
2011-01-10 3 2011-01-11 6 2011-01-12 9 2011-01-13 12 2011-01-14 15 2011-01-15 18 2011-01-16 21 2011-01-17 24 2011-01-18 17 2011-01-19 9
ie - I want the sum of the "current" day plus the next two days. My current solution is not sufficient because I care about what happens at the edges. I know I could solve this manually by setting up two additional columns that are shifted by 1 and 2 days respectively and then summing the three columns, but there's got to be a more elegant solution.
Why not just do it on the reversed Series (and reverse the answer):
In [11]: pd.rolling_sum(ts[::-1], window=3, min_periods=0)[::-1] Out[11]: 2011-01-10 3 2011-01-11 6 2011-01-12 9 2011-01-13 12 2011-01-14 15 2011-01-15 18 2011-01-16 21 2011-01-17 24 2011-01-18 17 2011-01-19 9 Freq: D, dtype: float64
I struggled with this then found an easy way using shift.
If you want a rolling sum for the next 10 periods, try:
df['NewCol'] = df['OtherCol'].shift(-10).rolling(10, min_periods = 0).sum()
We use shift so that "OtherCol" shows up 10 rows ahead of where it normally would be, then we do a rolling sum over the previous 10 rows. Because we shifted, the previous 10 rows are actually the future 10 rows of the unshifted column. :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With