
Sum values from common ID in a dataframe

Tags: r, aggregate

I have a data frame that looks like:

df <- data.frame(
  id     = c("xx33", "xx33", "xx22", "xx11", "xx11", "xx00"),
  amount = c(10, 15, 100, 20, 10, 15),
  date   = c("01/02/2013", "01/02/2013", "02/02/2013",
             "03/03/2013", "03/03/2013", "04/04/2013")
)

    id amount       date
1 xx33     10 01/02/2013
2 xx33     15 01/02/2013
3 xx22    100 02/02/2013
4 xx11     20 03/03/2013
5 xx11     10 03/03/2013
6 xx00     15 04/04/2013

I want to group the rows by id, sum the amount within each group, and count the number of occurrences of each id, while carrying along the information that is the same for every row of an id, such as date (and any other such variable). So I want the output to be:

    id sum       date number
1 xx33  25 01/02/2013      2
2 xx22 100 02/02/2013      1
3 xx11  30 03/03/2013      2
4 xx00  15 04/04/2013      1

I've tried

ddply(.data = df, .var = "id", .fun = nrow)

That returns the number of occurrences of each id, but I can't figure out how to sum the amounts for each common id without looping.

John M. asked Oct 25 '13 18:10


2 Answers

Using the data.table package:

library(data.table)
dt <- data.table(df)
dt2 <- dt[,list(sumamount = sum(amount), freq = .N), by = c("id","date")]

Output:

> dt2
     id       date sumamount freq
1: xx33 01/02/2013        25    2
2: xx22 02/02/2013       100    1
3: xx11 03/03/2013        30    2
4: xx00 04/04/2013        15    1
TheComeOnMan answered Oct 01 '22 07:10


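Here's a base R solution. merge() joins the counts from table() to the sums by id, so the two results do not need to be in the same row order:

> merge(aggregate(amount ~ id + date, data = df, FUN = sum), as.data.frame(table(id = df$id)), by = "id")
    id       date amount Freq
1 xx00 04/04/2013     15    1
2 xx11 03/03/2013     30    2
3 xx22 02/02/2013    100    1
4 xx33 01/02/2013     25    2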
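
The sum and the count can also come out of a single aggregate() call, which avoids having to line up two separately ordered results. What follows is only a minimal sketch, assuming the df from the question. The output column names sum and number are borrowed from the desired output in the question, and flattening with do.call(data.frame, ...) is one common idiom rather than the only option.

```r
# Sample data, as defined in the question.
df <- data.frame(
  id     = c("xx33", "xx33", "xx22", "xx11", "xx11", "xx00"),
  amount = c(10, 15, 100, 20, 10, 15),
  date   = c("01/02/2013", "01/02/2013", "02/02/2013",
             "03/03/2013", "03/03/2013", "04/04/2013")
)

# Group by id and date; FUN returns a named vector per group, so
# aggregate() stores a two-column matrix (sum, number) in `amount`.
res <- aggregate(amount ~ id + date, data = df,
                 FUN = function(x) c(sum = sum(x), number = length(x)))

# Flatten the matrix column into two ordinary columns and rename
# everything to match the desired output (names are an assumption).
res <- do.call(data.frame, res)
names(res) <- c("id", "date", "sum", "number")
res
```

Any other variable that is constant within an id can be added to the right-hand side of the formula (for example amount ~ id + date + other, where other is a hypothetical column) so that it is carried through to the result.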
Jilber Urbina answered Oct 01 '22 06:10