I have a data.table with about 100 numeric columns and one ID column, which is set as the key. I would like to create a new table that contains the sum of each numeric column, grouped by the key.
For example, my data right now is:
ID Count1 Count2 Count3
 1      1      3      0
 1      3      3      3
 2      1      2      1
 3      1      1      2
What I would like to have is:
ID Count1 Count2 Count3
 1      4      6      3
 2      1      2      1
 3      1      1      2
I have tried so many ways to get this. I know I can do:
Y <- X[, list(Count1 = sum(Count1), Count2 = sum(Count2), Count3 = sum(Count3)), by = ID]
However, I have hundreds of variables, and I only have their names in a list. How should I go about handling this?
Thanks a lot for your help.
Here is some code to generate test data:
library(data.table)
ID <- c(rep(210, 9), rep(3917, 6))
Count1 <- c(1,1,0,1,3,1,4,1,1,1,1,1,1,0,1)
Count2 <- c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1)
Count3 <- c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1)
x <- data.table(ID, Count1, Count2, Count3)
setkey(x, ID)
Your test data doesn't match the example you gave, but regardless, you can take advantage of the fact that data.table provides a special symbol named .SD, which stands for "Subset of Data". Within each group, .SD holds all the columns except the grouping column, so this should work:
x[, lapply(.SD, sum), by = ID]
#----
     ID Count1 Count2 Count3
1:  210     13      5      5
2: 3917      5      5      5
This is actually covered in the FAQ: type vignette("datatable-faq", package="data.table") or find it online.
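Since the question mentions that the column names are only available in a list, here is a minimal sketch of restricting .SD to those columns with the .SDcols argument. The names col_list, cols, and totals are made up for illustration; the data is the test data from the question.

```r
library(data.table)

# Test data from the question
ID <- c(rep(210, 9), rep(3917, 6))
Count1 <- c(1,1,0,1,3,1,4,1,1,1,1,1,1,0,1)
Count2 <- c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1)
Count3 <- c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1)
x <- data.table(ID, Count1, Count2, Count3)
setkey(x, ID)

# Hypothetical: the column names arrive as a list, so flatten them
# into a character vector first
col_list <- list("Count1", "Count2", "Count3")
cols <- unlist(col_list)

# .SDcols limits .SD to the named columns; sum each one per ID
totals <- x[, lapply(.SD, sum), by = ID, .SDcols = cols]
print(totals)
```

Without .SDcols, .SD covers every non-grouping column, so this only matters when the table also holds columns that should not be summed.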
As a data.table is a data.frame, you can use aggregate for this:
> aggregate(. ~ ID, data=x, FUN=sum)
    ID Count1 Count2 Count3
1  210     13      5      5
2 3917      5      5      5
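If only the columns named in a vector should be summed, aggregate also accepts a data frame plus a by list instead of a formula. This is a rough sketch on a plain data.frame (not a data.table, where single-bracket indexing by a character vector behaves differently); df, cols, and totals_df are illustrative names.

```r
# Same test data as in the question, held in a plain data.frame
df <- data.frame(
  ID     = c(rep(210, 9), rep(3917, 6)),
  Count1 = c(1,1,0,1,3,1,4,1,1,1,1,1,1,0,1),
  Count2 = c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1),
  Count3 = c(1,0,0,1,0,1,0,1,1,1,1,1,1,0,1)
)

# Hypothetical vector of the column names to sum
cols <- c("Count1", "Count2", "Count3")

# Select the columns by name and group by ID
totals_df <- aggregate(df[cols], by = list(ID = df$ID), FUN = sum)
print(totals_df)
```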