I'd like to do the equivalent of the following, but with data.table's "by":
dt <- data.table(V1=rnorm(100), V2=rnorm(100), V3=rnorm(100), ...
group=rbinom(100,2,.5))
dt.agg <- aggregate(dt, by=list(dt$group), FUN=mean)
I know that I could do this:
dt.agg <- dt[, list(V1=mean(V1), V2=mean(V2), V3=mean(V3)), by=group]
But in my case I have 100 or so columns, V1-V100, and I always want to aggregate all of them by a single factor (as in aggregate above), so writing out every column by hand as in the data.table solution above isn't feasible.
Use .SD, which for each group holds all columns except the grouping one:
dt[, lapply(.SD, mean), by=group]
To specify columns:
dt[,...,by=group, .SDcols=c("V1", "V2", "V3", ...)]
dt[,...,by=group, .SDcols=names(dt)[1:100]]
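For reference, here is a minimal runnable sketch of both forms on a small three-column table. The seed, the names agg_all, agg_sel and value_cols, and the grep pattern are assumptions made for illustration and are not part of the original question or answer.

```r
library(data.table)

set.seed(1)  # hypothetical seed, only to make the sketch reproducible

# Small stand-in for the real table: three value columns plus a grouping column
dt <- data.table(V1 = rnorm(100), V2 = rnorm(100), V3 = rnorm(100),
                 group = rbinom(100, 2, .5))

# Mean of every non-grouping column, computed per group
agg_all <- dt[, lapply(.SD, mean), by = group]

# Same aggregation, restricted to columns picked by name
# (the pattern "^V[0-9]+$" is an assumption about how the columns are named)
value_cols <- grep("^V[0-9]+$", names(dt), value = TRUE)
agg_sel <- dt[, lapply(.SD, mean), by = group, .SDcols = value_cols]

print(agg_all)
```

Selecting columns by name rather than by position (names(dt)[1:100]) should keep working if the column order changes, at the cost of depending on a naming convention.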