
Loop through columns in a data.table and transform those columns

Tags: r, data.table

I have a data.table DT with a column named RF and many columns with an underscore (_) in their names. I want to loop through all the columns with an underscore and subtract the RF column from each of them. However, I'm stuck. It seems that nothing on the RHS of the := operator in a data.table works with dynamic variables.

Here is my DT and the desired output (hardcoded):

    library(data.table)
    DT <- data.table(RF  = 1:10,
                     S_1 = 11:20,
                     S_2 = 21:30)
    #Desired output
    DT[ , S_1 := S_1 - RF]
    DT[ , S_2 := S_2 - RF]
    DT
          RF S_1 S_2
     [1,]  1  10  20
     [2,]  2  10  20
     [3,]  3  10  20
    ...

However, I want this to be more flexible, i.e. loop through every column with "_" in its name and subtract RF:

    #1. try: Does not work; Interestingly, the i on the LHS of := is interpreted
    #as the column i, but on the RHS of := it is interpreted as 2 and 3, respectively
    for (i in grep("_", names(DT))){
      DT[ , i:= i - 1, with=FALSE]
    }
    DT
          RF  S_1 S_2
     [1,]  1   1   2
     [2,]  2   1   2
     [3,]  3   1   2
    ...

    #2. try: Work with parse and eval
    for (i in grep("_", names(DT), value=TRUE)){
      DT[ , eval(parse(text=i)):= eval(parse(text=i)) - RF]
    }
    #Error in eval(expr, envir, enclos) : object 'S_1' not found

Any hints on how to do that would be great.

EDIT: As soon as I posted the question, I asked myself why I was using the := operator in the first place, and sure enough, I don't have to. This works and doesn't need a loop:

    DT[, grep("_", names(DT)), with=FALSE] - DT[, RF]

Sorry for that. However, I'll leave the question open because I'm still interested in why my approach with the := operator doesn't work. Maybe someone can help me there.

Christoph_J asked Dec 04 '11 10:12



1 Answer

You were on the right track with your second attempt. Here is an approach that uses substitute to build the expression that gets passed in as the 'j' argument in DT[ , j ].

    for (i in grep("_", names(DT), value=TRUE)){
        e <- substitute(X := X - RF, list(X = as.symbol(i)))
        DT[ , eval(e)]
    }
    DT
    #     RF S_1 S_2
    # [1,]  1  10  20
    # [2,]  2  10  20
    # [3,]  3  10  20
    # [4,]  4  10  20
    # [5,]  5  10  20
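To see what substitute hands to DT[ , j ], it can help to build the expression for a single column and inspect it without evaluating it. The following is a minimal base-R sketch; the value "S_1" assigned to i is just an illustrative column name taken from the example data.

```r
# Build the call for one illustrative column name and inspect its parts
# without evaluating it (no data.table needed for this step).
i <- "S_1"
e <- substitute(X := X - RF, list(X = as.symbol(i)))

is.call(e)             # TRUE: e is an unevaluated call
as.character(e[[1]])   # the function being called, ":="
as.character(e[[2]])   # the LHS, now the symbol for the column name
all.vars(e)            # the variables the call refers to
```

Because X has been replaced by the symbol S_1 on both sides, the expression is equivalent to the hardcoded S_1 := S_1 - RF from the question by the time eval(e) runs inside DT[ , j ].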

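You could also use an LHS expression rather than a symbol. Wrapping i in parentheses makes data.table treat it as a variable holding the column name, and get(i) looks up that column on the RHS: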

    for (i in grep("_", names(DT), value=TRUE))
        DT[, (i) := get(i)-RF]
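If the loop itself is not needed, the same update can probably be written as a single call by combining a character vector on the LHS with .SD and .SDcols. This is a sketch of one alternative, not part of the original answer; the variable name cols is introduced here for illustration, and the data is the example from the question.

```r
library(data.table)

DT <- data.table(RF  = 1:10,
                 S_1 = 11:20,
                 S_2 = 21:30)

# Names of all columns containing an underscore
cols <- grep("_", names(DT), value = TRUE)

# Subtract RF from every selected column by reference, in one call.
# .SD holds only the columns listed in .SDcols; RF is still visible in j.
DT[, (cols) := lapply(.SD, function(x) x - RF), .SDcols = cols]
```

Like the loop versions, this modifies DT in place, unlike the subtraction in the question's EDIT, which returns a new object without RF.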
Josh O'Brien answered Oct 02 '22 12:10
