
Pandas DataFrame: difference between rolling and expanding function

Tags: python, pandas

Can anyone help me understand the difference between the rolling and expanding functions, using the example given in the pandas docs?

import numpy as np
from pandas import DataFrame

df = DataFrame({'B': [0, 1, 2, np.nan, 4]})
df
     B
0  0.0
1  1.0
2  2.0
3  NaN
4  4.0


df.expanding(2).sum()
     B
0  NaN  # 0 + NaN
1  1.0  # 1 + 0
2  3.0  # 2 + 1
3  3.0  # ??
4  7.0  # ?? 

df.rolling(2).sum()
     B
0  NaN  # 0 + NaN
1  1.0  # 1 + 0
2  3.0  # 2 + 1
3  NaN  # NaN + 2
4  NaN  # 4 + NaN

I have commented each row to show my understanding of the calculation. Is that correct for the rolling function? What about expanding? Where do the 3 and 7 in rows 3 and 4 come from?

ipramusinto asked Nov 07 '18 21:11


1 Answer

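The 2 in expanding(2) is min_periods, not the window size. An expanding window always starts at the first row and grows to include every row up to the current one, and NaN values are skipped in the sum. So row 3 is 0 + 1 + 2 = 3 and row 4 is 0 + 1 + 2 + 4 = 7. With min_periods=1 even the first row gets a value: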

df.expanding(min_periods=1).sum()
Out[117]: 
     B
0  0.0
1  1.0
2  3.0
3  3.0
4  7.0
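
To see where each value comes from, here is a minimal sketch that rebuilds the expanding sum by hand and compares it with the expanding(2) call from the question. The helper name manual_expanding_sum is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})

def manual_expanding_sum(values, min_periods):
    """Hypothetical helper: sum the non-NaN values from row 0 up to each row."""
    out = []
    for i in range(len(values)):
        window = values[: i + 1]            # the window always starts at row 0
        valid = window[~np.isnan(window)]   # NaN entries are skipped
        # Emit NaN until at least min_periods non-NaN observations exist
        out.append(valid.sum() if len(valid) >= min_periods else np.nan)
    return pd.Series(out)

by_hand = manual_expanding_sum(df['B'].to_numpy(), min_periods=2)
built_in = df['B'].expanding(2).sum()

# True if both agree, treating NaN in the same position as equal
print(np.allclose(by_hand, built_in, equal_nan=True))
```

The only effect of min_periods=2 here is that row 0 stays NaN, because a single observation is not enough yet.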

If you want the same result with rolling, set the window equal to the length of the DataFrame:

df.rolling(window=len(df),min_periods=1).sum()
Out[116]: 
     B
0  0.0
1  1.0
2  3.0
3  3.0
4  7.0
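
As a quick check of the equivalence above, the following sketch compares the two calls directly and contrasts them with the rolling(2) call from the question. The variable names are chosen for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})

expanding_sum = df.expanding(min_periods=1).sum()
rolling_full = df.rolling(window=len(df), min_periods=1).sum()
# For a fixed integer window, min_periods defaults to the window size
rolling_two = df.rolling(window=2).sum()

# A window as long as the frame never drops old rows, so it matches expanding
print(np.allclose(expanding_sum, rolling_full, equal_nan=True))

# With window=2 only the current and previous row are used,
# and both must be non-NaN for a result to appear
print(rolling_two['B'].tolist())
```

This is why rows 3 and 4 are NaN under rolling(2): each of those two-row windows contains the NaN from row 3, leaving only one valid observation.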
BENY answered Oct 11 '22 10:10