So I have a dataset which simplified looks something like this:
Year ID Sum
2009 999 100
2009 123 85
2009 666 100
2009 999 100
2009 123 90
2009 666 85
2010 999 100
2010 123 100
2010 666 95
2010 999 75
2010 123 100
2010 666 85
I'd like to add a column with the cumulative sum, by year and ID. Like this:
Year ID Sum Cum.Sum
2009 999 100 100
2009 123 85 85
2009 666 100 100
2009 999 100 200
2009 123 90 175
2009 666 85 185
2010 999 100 100
2010 123 100 100
2010 666 95 95
2010 999 75 175
2010 123 100 200
2010 666 85 180
I think this should be pretty straight-forward, but somehow I haven't been able to figure it out. How do I do this? Thanks for the help!
Using data.table
:
require(data.table)
DT <- data.table(DF)
DT[, Cum.Sum := cumsum(Sum), by=list(Year, ID)]
Year ID Sum Cum.Sum
1: 2009 999 100 100
2: 2009 123 85 85
3: 2009 666 100 100
4: 2009 999 100 200
5: 2009 123 90 175
6: 2009 666 85 185
7: 2010 999 100 100
8: 2010 123 100 100
9: 2010 666 95 95
10: 2010 999 75 175
11: 2010 123 100 200
12: 2010 666 85 180
Another way
1) use ddply to sum a variable by group (similar to SQL group by)
X <- ddply ( dataset, .(Year,ID), sum)
2) merge the result with dataset
Y <- merge( dataset, X, by=('Year','ID')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With