Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to compute cumulative sum of previous N rows in pandas?

I am working with pandas, but I don't have so much experience. I have the following DataFrame:

          A
0       NaN
1      0.00
2      0.00
3      3.33
4     10.21
5      6.67
6      7.00
7      8.27
8      6.07
9      2.17
10     3.38
11     2.48
12     2.08
13     6.95
14     0.00
15     1.75
16     6.66
17     9.69
18     6.73
19     6.20
20     3.01
21     0.32
22     0.52

and I need to compute the cumulative sum of the previous 11 rows. When there is less than 11 previously, they remaining are assumed to be 0.

        B
0     NaN
1    0.00
2    0.00
3    0.00
4    3.33
5    13.54
6    20.21
7    27.20
8    35.47
9    41.54
10    43.72
11   47.09
12   49.57 
13   51.65
14   58.60
15   58.60
16   57.02
17   53.48
18   56.49
19   56.22
20   54.16
21   51.10
22   49.24

I have tried:

df['B'] = df.A.cumsum().shift(-11).fillna(0)

However, this is not achieving what I want, but this is rotating the result of a cumulative sum. How can I achieve this?

like image 636
lmiguelvargasf Avatar asked May 04 '17 15:05

lmiguelvargasf


People also ask

How do you calculate cumulative sum in pandas?

The cumsum() method returns a DataFrame with the cumulative sum for each row. The cumsum() method goes through the values in the DataFrame, from the top, row by row, adding the values with the value from the previous row, ending up with a DataFrame where the last row contains the sum of all values for each column.

How do you calculate cumulative in Python?

cumsum() to find cumulative sum of a Series. Pandas Series. cumsum() is used to find Cumulative sum of a series. In cumulative sum, the length of returned series is same as input and every element is equal to sum of all previous elements.


3 Answers

Call rolling with min_periods=1 and window=11 and sum:

In [142]:
df['A'].rolling(min_periods=1, window=11).sum()

Out[142]:
0       NaN
1      0.00
2      0.00
3      3.33
4     13.54
5     20.21
6     27.21
7     35.48
8     41.55
9     43.72
10    47.10
11    49.58
12    51.66
13    58.61
14    55.28
15    46.82
16    46.81
17    49.50
18    47.96
19    48.09
20    48.93
21    45.87
22    43.91
Name: A, dtype: float64
like image 195
EdChum Avatar answered Oct 27 '22 20:10

EdChum


you might have to do it the hard way

B = []
i =0
m_lim = 11
while i<len(A):
    if i<m_lim:
      B.append(sum(A[0:i]))
    if i>=m_lim and i < len(A) -m_lim:
        B.append(sum(A[i-m_lim:i]))
    if i>= len(A) -m_lim:
      B.append(sum(A[i:]))
    i=i+1
df['B'] = B
like image 41
Mohammad Athar Avatar answered Oct 27 '22 20:10

Mohammad Athar


Check the pandas.Series.expanding. The series.expanding(min_periods=2).sum()

will do the job for you. And don't forget to set 0-th element, since it is NaN. I mean,

accumulation = series.expanding(min_periods=2).sum()
accumulation[0] = series[0] # or as you like
like image 2
Levon Minasian Avatar answered Oct 27 '22 19:10

Levon Minasian