
Cumulative Sum Function on Pandas Data Frame

Tags: python, pandas

I am attempting to capture a "running" cumulative sum given a series of period amounts.

See example:

[Image: example table, not reproduced here]

df = df[1:4].cumsum() # this doesn't return the desired result
asked Oct 23 '15 19:10 by AME




1 Answer

You're looking for the axis parameter. Many Pandas functions take this argument to choose the direction in which an operation runs. Use axis=0 (the default) to accumulate down the rows within each column, and axis=1 to accumulate across the columns within each row. Your running total moves across the columns, so you want axis=1.

df.cumsum(axis=1) by itself works on your example to produce the output table.

In [3]: df.cumsum(axis=1)
Out[3]:
      1   2   3   4
10   16  30  41  61
51   13  29  40  50
13   11  30  45  61
321  12  27  37  52
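
To reproduce this, here is a minimal sketch. The question's input table was posted as an image, so the period amounts below are an assumption: they are back-computed from the Out[3] table above by differencing adjacent columns. The string column labels are likewise assumed.

```python
import pandas as pd

# Period amounts inferred from the cumulative output above (an assumption,
# because the original input table is not available as text).
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# axis=1: running total across the columns of each row (what the question asks for).
across = df.cumsum(axis=1)

# axis=0 (the default): running total down the rows of each column.
down = df.cumsum(axis=0)

print(across)
print(down)
```

With axis=1 the last column holds each row's total; with axis=0 the last row holds each column's total.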

I suspect you're interested in restricting to a specific range of columns, though. To do that, select the columns with .loc before calling cumsum. The column labels are strings in my copy of your data.

In [4]: df.loc[:, '2':'3'].cumsum(axis=1)
Out[4]:
      2   3
10   14  25
51   16  27
13   19  34
321  15  25

.loc is label-based and is inclusive of the bounds. If you want to find out more about indexing in Pandas, check the docs.
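
As a rough illustration of the inclusive bounds, the sketch below compares .loc with the position-based .iloc, whose stop position is excluded. It also shows what a plain df[1:4] selects, which may explain why the attempt in the question did not work. The input values and string column labels are assumptions reconstructed from the outputs in this answer.

```python
import pandas as pd

# Assumed input, back-computed from the cumulative outputs shown in this answer.
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# Label-based slice: both endpoints "2" and "3" are included.
by_label = df.loc[:, "2":"3"].cumsum(axis=1)

# Position-based slice: columns at positions 1 and 2; position 3 is excluded.
by_position = df.iloc[:, 1:3].cumsum(axis=1)

# A bare integer slice on the frame selects rows by position, not columns.
row_slice = df[1:4]

print(by_label)
print(by_position)
print(row_slice)
```

Both column selections give the same result here. The bare slice returns the second through fourth rows with every column, so calling cumsum() on it accumulates down those rows.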

answered Sep 21 '22 10:09 by Brian