
data.table drop key rows and summarize

Tags: r, data.table

I'm looking for an elegant way to iterate over the key values of a data.table and, for each value, drop the rows that have that key and then compute a summary over the remaining rows. For example:

library(data.table)

mydt <- data.table(cat = c("a","a","b","b","c","c","c"), vals = 1:7)
setkey(mydt, cat)
# not-join on the key: mean of vals over every row whose cat is NOT the given value
tmp1 <- mydt[!"a"][, mean(vals)]
tmp2 <- mydt[!"b"][, mean(vals)]
tmp3 <- mydt[!"c"][, mean(vals)]
outdt <- data.table(cat = c("a","b","c"), means = c(tmp1, tmp2, tmp3))

Is there a way to loop over the key and do this elegantly? Thanks.

Rob Richmond asked Aug 05 '14 00:08


1 Answer

I think this does it, using more traditional data.table code:

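# .BY holds the current group's value of the by column,
# so !.BY is a not-join that drops that group's rows
setkey(mydt, cat)
mydt[, list(means = mean(mydt[!.BY, vals])), by = cat]

# or without needing to key first
mydt[, list(means = mean(mydt[cat != .BY, vals])), by = cat]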

#   cat means
#1:   a   5.0
#2:   b   4.2
#3:   c   2.5
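
As a possible alternative, not part of the answer above: because the summary here is a mean, the table does not have to be re-subset for every group. A minimal sketch, assuming `vals` contains no `NA`s and there are at least two groups, subtracts each group's sum and row count from the grand totals. The names `total_sum` and `total_n` are made up for this illustration.

```r
library(data.table)

mydt <- data.table(cat = c("a","a","b","b","c","c","c"), vals = 1:7)

# Grand totals over the whole table, computed once (hypothetical helper names)
total_sum <- sum(mydt$vals)
total_n   <- nrow(mydt)

# Mean of the rows outside each group:
# (grand sum - group sum) / (grand row count - group row count)
outdt <- mydt[, .(means = (total_sum - sum(vals)) / (total_n - .N)), by = cat]
outdt
```

This only works for summaries that can be derived from totals (sums, counts, means). For an arbitrary summary function, the nested subset in the answer above is the more general approach.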
thelatemail answered Nov 15 '22 09:11