Consider this data.table:
dt <- data.table(V1=c(3,2),V2=c(4,2),V3=c(6,5),V4=c(7,8),V5=c(9,10))
> dt
V1 V2 V3 V4 V5
1: 3 4 6 7 9
2: 2 2 5 8 10
I would like to create a new column that pastes the values and names of a subset of columns. The original columns should be retained.
The desired output is:
> cols <- paste0("V",2:3)
> dt
V1 V2 V3 V4 V5 merged
1: 3 4 6 7 9 V2:4 V3:6
2: 2 2 5 8 10 V2:2 V3:5
I was able to get the output by sending .SD to a function and going through each row:
getSub <- function(SD){
lapply(1:nrow(SD), function(s) paste0(cols,":", SD[s], collapse=" "))
}
dt[, merged := list(getSub(.SD)), .SDcols=cols]
> dt
V1 V2 V3 V4 V5 merged
1: 3 4 6 7 9 V1:3 V2:4 V3:6 V4:7
2: 2 2 5 8 10 V1:2 V2:2 V3:5 V4:8
I was wondering if there is a way to use something like dt[, merged := list(do.call(paste,c(c(cols,.SD),sep=" "))), .SDcols=cols] since it appears to be much faster https://stackoverflow.com/a/23819177/2070021?
Try this:
dt[, merged := do.call(paste, Map(function(x, y) paste(x, y, sep = ':'),
names(.SD), .SD)),
.SDcols = cols]
Another option is constructing the expression and evaluating it, but the above seems fast enough not to bother with that mess.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With