Suppose I have a data.table as follows:
data = data.table(c("a","a","b","b","c"),c(1,2,3,4,5))
I would like to sum the numeric column (V2), but only for groups where the grouping column (V1) has more than one entry. My actual problem requires the use of .SD. I understand that I could create an N column via
data[ , N := .N, by = V1]
and then sum via
data[N > 1, lapply(.SD,sum), by = V1, .SDcols = 2]
However, is there a one-step call to do this?
Referencing .SD in the call doesn't return the expected answer:
data[, lapply(.SD[which(length(.SD)>1)],sum), by = V1, .SDcols = 2]
I would like to understand why this doesn't work. Neither does:
data[, lapply(.SD[which(.N>1)],sum), by = V1, .SDcols = 2]
Thanks!
library(data.table)
data <- data.table(c("a","a","b","b","c"),c(1,2,3,4,5))
data[, if(.N > 1) lapply(.SD, sum) else NULL, by=V1]
#    V1 V2
# 1:  a  3
# 2:  b  7
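As for why the two attempts in the question don't behave as hoped, here is a rough sketch based on my understanding of `.SD` and `.N`. Two assumptions are mine rather than the original poster's: the columns are named `V1` and `V2` explicitly (these match the default names), and the helper names `counts`, `first_rows` and `sums` are made up for illustration. The point is that `length(.SD)` counts columns rather than rows, and that `.SD[which(.N > 1)]` subsets rows by position instead of keeping or dropping the group as a whole.

```r
library(data.table)

# Same data as in the question, with the default column names spelled out
# (assumption: naming them explicitly does not change the behaviour).
data <- data.table(V1 = c("a", "a", "b", "b", "c"), V2 = c(1, 2, 3, 4, 5))

# length(.SD) counts the columns of .SD; .N counts the rows in the group.
counts <- data[, .(n_cols = length(.SD), n_rows = .N), by = V1, .SDcols = 2]

# which(.N > 1) is either 1 or integer(0), so this keeps the first row of a
# multi-row group, or no rows at all; it never keeps the whole group.
first_rows <- data[, .SD[which(.N > 1)], by = V1, .SDcols = 2]

# Testing .N once per group keeps or drops the group as a whole.
sums <- data[, if (.N > 1) lapply(.SD, sum), by = V1, .SDcols = 2]
```

Printing `counts` and `first_rows` should make the difference visible. The last line is the same idea as the answer above, just with `.SDcols` kept from the question and the `else NULL` left out.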