
Calculate cumulative sum (cumsum) by group

Tags: r, cumsum

Given this data frame:

df <- data.frame(id = rep(1:3, each = 5)
                 , hour = rep(1:5, 3)
                 , value = sample(1:15))

I want to add a cumulative sum column that restarts for each id:

df
   id hour value csum
1   1    1     7    7
2   1    2     9   16
3   1    3    15   31
4   1    4    11   42
5   1    5    14   56
6   2    1    10   10
7   2    2     2   12
8   2    3     5   17
9   2    4     6   23
10  2    5     4   27
11  3    1     1    1
12  3    2    13   14
13  3    3     8   22
14  3    4     3   25
15  3    5    12   37

How can I do this efficiently? Thanks!

Rock asked May 31 '13 05:05




1 Answer

df$csum <- ave(df$value, df$id, FUN=cumsum) 

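ave is the "go-to" function when you want a by-group result that has the same length as an existing vector and can be computed from that vector's per-group sub-vectors alone. If you need by-group processing based on multiple "parallel" values (several columns at once), the base strategy is do.call(rbind, by(dfrm, grp, FUN)), where dfrm is the data frame, grp the grouping vector, and FUN a function applied to each group's rows.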
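To make the two strategies concrete, here is a minimal sketch using base R only. The value column is copied from the printed output in the question so the result is reproducible. The helper add_weighted and the wsum column are hypothetical names invented for this illustration; they show one possible case where the per-group function needs two columns at once.

```r
# Fixed values copied from the question's printed data frame,
# used here instead of sample(1:15) so the result is reproducible.
df <- data.frame(id    = rep(1:3, each = 5),
                 hour  = rep(1:5, 3),
                 value = c(7, 9, 15, 11, 14, 10, 2, 5, 6, 4, 1, 13, 8, 3, 12))

# Strategy 1: ave() applies cumsum to value within each id and returns
# a vector of the same length, in the original row order.
df$csum <- ave(df$value, df$id, FUN = cumsum)

# Strategy 2: by() passes each group's whole sub-data-frame to the function,
# so several columns can be used together.
# add_weighted and wsum are hypothetical, for illustration only.
add_weighted <- function(d) {
  d$wsum <- cumsum(d$value * d$hour)  # running sum of value weighted by hour
  d
}
res <- do.call(rbind, by(df, df$id, add_weighted))

print(df)
print(res)
```

With the by() route the row names of res typically come back prefixed with the group label; rownames(res) <- NULL resets them if that matters.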

IRTFM answered Sep 28 '22 11:09
