
How do I perform a summation of `n` rows at a time in pandas? [duplicate]

Given a data frame

     A
0   14
1   59
2   38
3   40
4   99
5   89
6   70
7   64
8   84
9   40
10  30
11  94
12  65
13  29
14  48
15  26
16  80
17  79
18  74
19  69

This data frame has 20 rows. I would like to group n=5 rows at a time and sum each group, so my output would look like this:

     A
0  250
1  347
2  266
3  328 

df.rolling_sum will not help because it does not allow you to vary the stride when summing.

What other ways are there to do this?

cs95 asked Jan 03 '23 15:01



2 Answers

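Integer-divide the index by 5 so that each block of five rows shares a key, then sum per key:

df.set_index(df.index // 5).groupby(level=0).sum()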
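
A minimal runnable sketch of the same idea, assuming the data frame from the question and its default integer index. It groups directly on the integer-divided index instead of resetting it first, which should also work on pandas versions where `sum(level=...)` is not available. The names `values`, `n`, and `block_sums` are illustrative only.

```python
import pandas as pd

# Data from the question: 20 rows in a single column "A".
values = [14, 59, 38, 40, 99, 89, 70, 64, 84, 40,
          30, 94, 65, 29, 48, 26, 80, 79, 74, 69]
df = pd.DataFrame({"A": values})

n = 5  # number of rows per block

# With the default index 0..19, index // n maps rows 0-4 to key 0,
# rows 5-9 to key 1, and so on; groupby then sums each block.
block_sums = df.groupby(df.index // n).sum()
print(block_sums)
```

If the index is not a plain 0, 1, 2, ... range, grouping on `np.arange(len(df)) // n` (with NumPy imported as `np`) is one way to get the same positional blocks.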
piRSquared answered Jan 06 '23 05:01



If you can work with an ndarray of the sums rather than a Series (you can always construct a Series from it afterwards), you could use np.add.reduceat, which sums the slices that start at each of the given offsets.

np.add.reduceat(df.A.values, np.arange(0, df.A.size, 5))

Which in this case returns

array([250, 347, 266, 328])
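
As a hedged, self-contained sketch of the above, the following rebuilds the question's data, applies `np.add.reduceat`, and wraps the result back into a data frame. The names `starts`, `sums`, and `result` are illustrative, and `to_numpy()` is used here in place of `.values`.

```python
import numpy as np
import pandas as pd

# Data from the question: 20 rows in a single column "A".
values = [14, 59, 38, 40, 99, 89, 70, 64, 84, 40,
          30, 94, 65, 29, 48, 26, 80, 79, 74, 69]
df = pd.DataFrame({"A": values})

n = 5  # number of rows per block

# Start offsets of each block: 0, 5, 10, 15.
starts = np.arange(0, len(df), n)

# reduceat sums each slice [starts[i], starts[i+1]); the last slice
# runs to the end of the array, so a shorter final block is included.
sums = np.add.reduceat(df["A"].to_numpy(), starts)

# Wrap the ndarray back into a data frame if a pandas object is needed.
result = pd.DataFrame({"A": sums})
print(result)
```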
miradulo answered Jan 06 '23 05:01
