I am trying to learn the pandas library for Python, and I came across the concept of a "rolling window" for time-series analysis. I have never been a good student of statistics, so I am a bit lost.
Please explain the concept, preferably using a simple example, and maybe a code snippet.
Demo:
Setup:
In [11]: df = pd.DataFrame({'a':np.arange(10, 17)})
In [12]: df
Out[12]:
a
0 10
1 11
2 12
3 13
4 14
5 15
6 16
Rolling sum over a 2-row window:
In [13]: df['a'].rolling(2).sum()
Out[13]:
0 NaN # no previous value exists, so the window is incomplete: NaN + 10 = NaN
1 21.0 # sum of the current and previous value: 10 + 11
2 23.0 # sum of the current and previous value: 11 + 12
3 25.0 # ...
4 27.0
5 29.0
6 31.0
Name: a, dtype: float64
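To make the "window" idea concrete, here is a minimal sketch that rebuilds the 2-row rolling sum by hand with a plain loop. The names `s`, `window`, `manual` and `rolled` are only illustrative, and the loop is meant to show the idea, not how pandas computes it internally.

```python
import numpy as np
import pandas as pd

# Same data as the demo above, as a Series.
s = pd.Series(np.arange(10, 17), name="a")
window = 2

manual = []
for i in range(len(s)):
    if i + 1 < window:
        # Not enough rows yet to fill the window, so the result is NaN.
        manual.append(np.nan)
    else:
        # Sum the current row and the (window - 1) rows before it.
        manual.append(float(s.iloc[i - window + 1 : i + 1].sum()))

manual = pd.Series(manual, name="a")
rolled = s.rolling(window).sum()

print(manual.equals(rolled))
```

The window slides down one row at a time; at each position only the rows inside it take part in the calculation.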
Rolling sum over a 3-row window:
In [14]: df['a'].rolling(3).sum()
Out[14]:
0 NaN # sum of current value and two preceding rows: NaN + NaN + 10
1 NaN # sum of current value and two preceding rows: NaN + 10 + 11
2 33.0 # sum of current value and two preceding rows: 10 + 11 + 12
3 36.0 # ...
4 39.0
5 42.0
6 45.0
Name: a, dtype: float64
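The leading NaN values appear because, by default, a fixed-size window must be full before a result is produced. As a hedged sketch (the variable names `partial` and `mean3` are just for illustration), the `min_periods` argument relaxes that requirement, and the same window works with other aggregations such as `mean`:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10, 17), name="a")

# Allow results as soon as at least one value is in the window.
partial = s.rolling(3, min_periods=1).sum()
print(partial)  # first rows use 1 and 2 values instead of giving NaN

# Rolling mean (moving average) over the same 3-row window.
mean3 = s.rolling(3).mean()
print(mean3)    # first two rows are NaN, then the average of each 3-row window
```

A rolling mean like this is what is commonly called a moving average in time-series analysis.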