'Forward' cumulative sum in dplyr

Question

When examining datasets from longitudinal studies, I commonly get results like this from a dplyr analysis chain from the raw data:

df = data.frame(n_sessions=c(1,2,3,4,5), n_people=c(59,89,30,23,4))

i.e. a count of how many participants have completed a certain number of assessments at this point in time.

Although it is useful to know how many people have completed exactly n sessions, we more often need to know how many have completed at least n sessions. As the table below shows, a standard cumulative sum isn't appropriate. What we want are the values in the n_total column, which is a sort of "forwards cumulative sum" of the values in the n_people column. That is, the value in each row should be the sum of its own value and all values after it, whereas the standard cumulative sum is the sum of all values up to and including the current row:

n_sessions n_people  n_total  cumsum
         1       59      205      59
         2       89      146     148
         3       30       57     178
         4       23       27     201
         5        4        4     205
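To make the definition concrete, the n_total column can be checked directly in base R by summing each value with everything after it. This is only a sketch of the definition, not an efficient approach, and the names n_total and n_total_alt are just illustrative. The second calculation relies on the identity that the forward sum equals the grand total minus everything strictly before the current row:

```r
# Counts from the table above
n_people <- c(59, 89, 30, 23, 4)

# Definition-based version: for each row i, add row i and every row after it
n_total <- sapply(seq_along(n_people),
                  function(i) sum(n_people[i:length(n_people)]))

# Cross-check via the identity: total minus the sum of all earlier rows
n_total_alt <- sum(n_people) - (cumsum(n_people) - n_people)

n_total
# 205 146 57 27 4
```

Both vectors agree with the n_total column shown in the table.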

Generating the cumulative sum is simple:

mutate(df, cumsum = cumsum(n_people))

What would be an expression for generating a "forwards cumulative sum" that could be incorporated in a dplyr analysis chain? I'm guessing that cumsum would need to be applied to n_people after sorting by n_sessions descending, but can't quite get my head around how to get the answer while preserving the original order of the data frame.

David Robinson · Accepted Answer

You can take a cumulative sum of the reversed vector, then reverse that result. The built-in rev function is helpful here:

mutate(df, rev_cumsum = rev(cumsum(rev(n_people))))

For example, on your data this returns:

  n_sessions n_people rev_cumsum
1          1       59        205
2          2       89        146
3          3       30         57
4          4       23         27
5          5        4          4
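Note that reversing the vector gives the intended totals only when the rows are already sorted by n_sessions in ascending order, as they are in your example. If that is not guaranteed, one possible approach is to take the cumulative sum along the descending order of n_sessions and write the results back into their original positions. The sketch below assumes a hypothetical helper, cumsum_by, and a shuffled copy of your data, df_shuffled; neither is part of dplyr:

```r
library(dplyr)

# Same counts as in the question, but with the rows deliberately shuffled
df_shuffled <- data.frame(n_sessions = c(3, 1, 5, 2, 4),
                          n_people   = c(30, 59, 4, 89, 23))

# Hypothetical helper (not a dplyr function): cumulative sum of x taken
# along the ordering of `by`, returned in the original row order
cumsum_by <- function(x, by, decreasing = FALSE) {
  ord <- order(by, decreasing = decreasing)
  out <- numeric(length(x))
  out[ord] <- cumsum(x[ord])
  out
}

result <- df_shuffled %>%
  mutate(n_total = cumsum_by(n_people, n_sessions, decreasing = TRUE))

result$n_total
# 57 205 4 146 27
```

The row order of df_shuffled is left untouched, and each row still receives the count of people who completed at least that many sessions.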

Tags: r, dplyr

Michael MacAskill
