
Select the maximum/minimum from n previous rows in a DataFrame

I'm using pandas in Python and I'm having trouble selecting some data. I have a DataFrame with float values, and I would like to create a column that contains the maximum (or minimum) of the n previous rows of a column, set to 0 for the first n rows. Here's an example of the result I would like to have:

import pandas as pd

df_test = pd.DataFrame({'a':[2,7,2,0,-1, 19, -52, 2]})
df_test['result_i_want_with_n=3'] = [0, 0, 0, 7, 7, 2, 19, 19]
print(df_test)
    a   result_i_want_with_n=3
0   2   0
1   7   0
2   2   0
3   0   7
4   -1  7
5   19  2
6   -52 19
7   2   19

I managed to get this result using a while loop, but I would like to write it in a more "pandas" way to gain computation speed.

Thanks

Antoine asked Jul 04 '17 12:07

1 Answer

Rolling is your friend here. You need to shift the result by one row to get your exact output; otherwise each window includes the current row, and the first value appears in the third row (index 2) instead of the fourth.

df_test['a'].rolling(window=3).max().shift(1).fillna(0)

0     0.0
1     0.0
2     0.0
3     7.0
4     7.0
5     2.0
6    19.0
7    19.0
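
The same idea can be generalized to any n and to the minimum. The following is only a minimal sketch: the helper name previous_extreme and its how and fill parameters are made up for illustration and are not part of pandas. It shifts first and then rolls, which for this data gives the same values as rolling and then shifting.

```python
import pandas as pd


def previous_extreme(series, n, how="max", fill=0):
    """Max or min of the n rows before each row (hypothetical helper).

    Rows with fewer than n previous values get `fill`.
    """
    if how not in ("max", "min"):
        raise ValueError("how must be 'max' or 'min'")
    # shift(1) moves every value down one row, so the rolling window
    # of size n ends at the previous row and excludes the current one.
    windowed = series.shift(1).rolling(window=n)
    result = windowed.max() if how == "max" else windowed.min()
    # Incomplete windows are NaN; replace them with the fill value.
    return result.fillna(fill)


df_test = pd.DataFrame({'a': [2, 7, 2, 0, -1, 19, -52, 2]})
df_test['prev_max'] = previous_extreme(df_test['a'], n=3, how="max")
df_test['prev_min'] = previous_extreme(df_test['a'], n=3, how="min")
print(df_test)
```

Note that the new columns are float, because the intermediate result contains NaN, and that fillna also replaces any NaN caused by missing values in the source column, not only those in the first n rows.
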
P.Tillmann answered Oct 03 '22 13:10