I would like to scale a subset of the columns in my data.table. There are many of these columns, so I want to avoid specifying them all by name. The columns that are not being scaled should be returned as is. Here is what I was hoping would work, but it does not:
require(data.table)
x = data.table(id=1:10, a=sample(1:10,10), b=sample(1:10,10), c=sample(1:10,10))
> dput(x)
structure(list(id = 1:10, a = c(1L, 6L, 10L, 7L, 5L, 3L, 2L,
4L, 9L, 8L), b = c(4L, 9L, 5L, 7L, 6L, 1L, 8L, 10L, 3L, 2L),
c = c(2L, 7L, 5L, 6L, 4L, 1L, 10L, 9L, 8L, 3L)), .Names = c("id",
"a", "b", "c"), row.names = c(NA, -10L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x1a85d088>)
sx = x[,c(id, lapply(.SD, function(v) as.vector(scale(v)))), .SDcols = colnames(x)[2:4]]
Error in eval(expr, envir, enclos) : object 'id' not found
Any suggestions?
You could also assign by reference in a copy of the data.table:
sc <- names(x)[2:4]
sx <- copy(x)[ , (sc) := as.data.table(scale(.SD)), .SDcols = sc]
scale() returns a matrix, and IIRC data.table doesn't like matrix columns, hence the as.data.table() wrapper above.
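A small sketch of that point, using a made-up vector v rather than the question's data: scale() hands back a one-column matrix even for plain vector input, and as.vector() drops the matrix attributes again.

```r
v <- c(1L, 6L, 10L, 7L, 5L)   # hypothetical input, not from the question

m <- scale(v)                 # centred and scaled, but returned as a matrix
is.matrix(m)                  # TRUE
dim(m)                        # 5 1

s <- as.vector(m)             # strip dim and the scaled:* attributes
is.matrix(s)                  # FALSE
```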
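Or, column by column, using as.vector() so that each result is a plain vector rather than a one-column matrix:
sx <- copy(x)[ , (sc) := lapply(.SD, function(v) as.vector(scale(v))), .SDcols = sc]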
[ The parentheses around (sc) tell data.table to take the LHS column names from the value of the variable sc in the calling scope, rather than treating sc itself as a column name. ]
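To make the difference concrete, here is a minimal sketch on a toy table (hypothetical data, deliberately including a column that really is named sc). It assumes a reasonably recent data.table.

```r
library(data.table)

# Toy table: note the column literally named "sc"
x  <- data.table(id = 1:3, a = c(1, 2, 3), sc = c(10, 20, 30))
sc <- "a"                    # variable holding the target column name

y <- copy(x)[, (sc) := 0]    # parentheses: LHS is the value of sc, so column "a" is overwritten
z <- copy(x)[, sc := 0]      # bare name: LHS is the column named "sc"

y   # a is all 0, sc untouched
z   # sc is all 0, a untouched
x   # unchanged, because both assignments were made on copies
```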