I have a data frame that looks like:
df<-data.frame(id=c("xx33","xx33","xx22","xx11","xx11","xx00"),amount=c(10,15,100,20,10,15),date=c("01/02/2013","01/02/2013","02/02/2013","03/03/2013","03/03/2013","04/04/2013"))
    id amount       date
1 xx33     10 01/02/2013
2 xx33     15 01/02/2013
3 xx22    100 02/02/2013
4 xx11     20 03/03/2013
5 xx11     10 03/03/2013
6 xx00     15 04/04/2013
I want to group the rows by id, sum the amount, and count the number of occurrences of each id, while also carrying along the common information such as date, which is the same for each id (along with any other variable). So, I want the output to be:
    id sum       date number
1 xx33  25 01/02/2013      2
2 xx22 100 02/02/2013      1
3 xx11  30 03/03/2013      2
4 xx00  15 04/04/2013      1
I've tried
ddply(.data = df, .var = "id", .fun = nrow)
and that returns the number of occurrences of each id, but I can't figure out a way to sum the amounts for the common ids without looping.
Using the data.table library:
library(data.table)
dt <- data.table(df)
dt2 <- dt[,list(sumamount = sum(amount), freq = .N), by = c("id","date")]
Output:
> dt2
     id       date sumamount freq
1: xx33 01/02/2013        25    2
2: xx22 02/02/2013       100    1
3: xx11 03/03/2013        30    2
4: xx00 04/04/2013        15    1
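Here's a base R solution. The counts are joined with merge() on id rather than cbind(), because aggregate() and table() return their rows in different orders and cbind() would pair each id with the wrong count:
> merge(aggregate(amount ~ id + date, data = df, FUN = sum), as.data.frame(table(id = df$id)), by = "id")
    id       date amount Freq
1 xx00 04/04/2013     15    1
2 xx11 03/03/2013     30    2
3 xx22 02/02/2013    100    1
4 xx33 01/02/2013     25    2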
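Another base R option, offered as a minimal sketch rather than the definitive way to do it: add a helper column of ones and let a single aggregate() call sum both the amount and the helper within each id/date group, so the sum and the count are computed together and cannot get out of step. The helper column name `number` and the result name `res` are my own choices, picked to match the layout asked for in the question.

```r
df <- data.frame(
  id     = c("xx33", "xx33", "xx22", "xx11", "xx11", "xx00"),
  amount = c(10, 15, 100, 20, 10, 15),
  date   = c("01/02/2013", "01/02/2013", "02/02/2013",
             "03/03/2013", "03/03/2013", "04/04/2013")
)

# Helper column of ones: summing it per group gives the row count.
df$number <- 1

# Sum both columns within each id/date group in one pass.
res <- aggregate(cbind(amount, number) ~ id + date, data = df, FUN = sum)

# Rename and reorder the columns to match the requested layout.
names(res)[names(res) == "amount"] <- "sum"
res <- res[, c("id", "sum", "date", "number")]
print(res)
```

Any other variable that is constant within an id can be carried along by adding it to the right-hand side of the formula, the same way date is.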