For the following dataset:
d = data.frame(date = as.Date(as.Date('2015-01-01'):as.Date('2015-04-10'), origin = "1970-01-01"),
group = rep(c('A','B','C','D'), 25), value = sample(1:100))
head(d)
date group value
1: 2015-01-01 A 4
2: 2015-01-02 B 32
3: 2015-01-03 C 46
4: 2015-01-04 D 40
5: 2015-01-05 A 93
6: 2015-01-06 B 10
.. can anyone advise a more elegant way to calculate a cumulative total of values by group than this data.table method?
library(data.table)
setDT(d)
d.cast = dcast.data.table(d, group ~ date, value.var = 'value', fun.aggregate = sum)
c.sum = d.cast[, as.list(cumsum(unlist(.SD))), by = group]
.. which is pretty clunky and yields a flat matrix that needs tidyr::gather or reshape2::melt to reformat.
Surely R can do better than this??
Cumulative means "how much so far". Think of the word "accumulate", which means to gather together. To get cumulative totals, just add up the values as you go.
A running total is the cumulative sum of a value and all previous values in the column. For example, imagine you are in sales and storing information about the number of items sold on a particular day. You might want to calculate a running total, the total number of items sold up to a specific date.
Cumulative sums, or running totals, are used to display the total sum of data as it grows with time (or any other series or progression). This lets you view the total contribution so far of a given measure against time.
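As a quick illustration of the idea, here is a minimal base R sketch. The `sold` vector is made-up daily sales data, not taken from the question; `cumsum()` is the base R function that does the accumulating.

```r
# Hypothetical number of items sold on four consecutive days
sold <- c(5, 3, 8, 2)

# Each element is the sum of that day's value and all earlier ones
running_total <- cumsum(sold)
running_total
# [1]  5  8 16 18
```

The last element is always the grand total, so the running total never needs a separate `sum()` call.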
If you just want cumulative sums per group, then you can do
transform(d, new=ave(value,group,FUN=cumsum))
with base R.
This should work
library(dplyr)
d %>%
group_by(group) %>%
arrange(date) %>%
mutate(Total = cumsum(value))
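Since the question started from data.table, the same grouped running total can probably be done there directly, without casting to wide format. This is only a sketch: the small table below is a made-up stand-in for the question's `d`, and `Total` is just a column name chosen to match the dplyr answer.

```r
library(data.table)

# Small stand-in for the question's data (values invented for illustration)
d <- data.table(
  date  = as.Date("2015-01-01") + 0:7,
  group = rep(c("A", "B", "C", "D"), 2),
  value = c(4, 32, 46, 40, 93, 10, 7, 1)
)

setorder(d, date)                        # make sure rows are in date order
d[, Total := cumsum(value), by = group]  # running total per group, added by reference
d
```

This keeps the data in long format, so no gather or melt step should be needed afterwards.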