
r - passing variables as data.table column names

Tags:

r

data.table

The more I use it, the more data.table is replacing dplyr as my go-to package, mainly because of the speed it offers.

Question

Can you pass variables to i in data.table (dt[i, j]) without creating an expression?

Example

Given a data.table:

library(data.table)
dt <- data.table(val1 = c(1,2,3),
                 val2 = c(3,2,1))

I would like to evaluate:

dt[(val1 > val2)]

but using a variable to reference the column names. For example,

myCol <- c("val1", "val2")  ## vector of column names

I've read a lot of questions that show ways of doing this with expressions:

## create an expression to evaluate
expr <- parse(text = paste0(myCol[1], " > ", myCol[2]))

## evaluate expression
dt[(eval(expr))]

   val1 val2
1:    3    1

But I was wondering if there is a more 'direct' way to do this that I've missed, something akin to:

dt[(myCol[1] > myCol[2])] 

Or is the expression route the way this should be done?

asked Oct 08 '15 03:10 by tospig


1 Answer

We can use eval(as.name(...)):

dt[eval(as.name(myCol[1])) > eval(as.name(myCol[2]))]
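
To see why the extra step is needed, here is a small sketch using the dt and myCol from the question. The names string_cmp and res are only illustrative. Without as.name(), the two elements of myCol are compared as plain strings, so no column is ever looked up.

```r
library(data.table)

dt <- data.table(val1 = c(1, 2, 3),
                 val2 = c(3, 2, 1))
myCol <- c("val1", "val2")

# Plain strings are compared as strings: this is "val1" > "val2",
# a single logical value that has nothing to do with the columns
string_cmp <- myCol[1] > myCol[2]

# as.name() turns each string into a symbol; eval() then resolves
# that symbol to the column when it is evaluated inside dt[...]
res <- dt[eval(as.name(myCol[1])) > eval(as.name(myCol[2]))]
print(res)
```

This should give the same row as dt[(val1 > val2)] in the question.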

Or we can specify the columns in .SDcols:

dt[dt[, .I[.SD[[1]] > .SD[[2]]], .SDcols = myCol]]
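
The nested call may be easier to read when split into two steps. This is only a sketch of the same idea; idx and res are illustrative names.

```r
library(data.table)

dt <- data.table(val1 = c(1, 2, 3),
                 val2 = c(3, 2, 1))
myCol <- c("val1", "val2")

# Step 1: .SD contains only the columns listed in .SDcols, in that order,
# and .I contains the row numbers, so this returns the matching row indices
idx <- dt[, .I[.SD[[1]] > .SD[[2]]], .SDcols = myCol]

# Step 2: subset the table by those row numbers
res <- dt[idx]
print(res)
```

Because the columns are addressed by position within .SD, the order of the names in myCol decides which column ends up on the left of the comparison.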

Or an option using get, suggested by @thelatemail:

dt[get(myCol[1]) > get(myCol[2])]
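
If the comparison is needed in several places, the get() form can be wrapped in a small helper. The function name filter_gt and its arguments are hypothetical, chosen only for this sketch, and it assumes the table has no columns named x or cols that would shadow the arguments.

```r
library(data.table)

# Hypothetical helper: keep the rows where column cols[1] exceeds column cols[2]
filter_gt <- function(x, cols) {
  stopifnot(is.data.table(x), length(cols) == 2L, all(cols %in% names(x)))
  # get() looks each name up as a column of x when evaluated inside x[...]
  x[get(cols[1]) > get(cols[2])]
}

dt <- data.table(val1 = c(1, 2, 3),
                 val2 = c(3, 2, 1))
myCol <- c("val1", "val2")

res <- filter_gt(dt, myCol)           # rows where val1 > val2
res_rev <- filter_gt(dt, rev(myCol))  # rows where val2 > val1
print(res)
print(res_rev)
```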

If there are only two elements, we can also use Reduce with mget (a slight variation of @thelatemail's answer)

dt[Reduce('>', mget(myCol))]
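
The restriction to two elements matters because Reduce folds from the left: with three vectors it computes (a > b) > c, which compares a logical result with a number instead of chaining the comparisons. A base R sketch with made-up vectors a, b and c3:

```r
# Made-up example vectors, not taken from the question
a  <- c(3, 2, 1)
b  <- c(2, 2, 2)
c3 <- c(1, 1, 1)

# Reduce evaluates (a > b) > c3: the logical result of a > b is
# coerced to 0/1 and then compared against c3
reduced <- Reduce(`>`, list(a, b, c3))

# A chained comparison has to be spelled out pairwise
pairwise <- (a > b) & (b > c3)

print(reduced)
print(pairwise)
```

The two results differ in the first position, so for more than two columns the pairwise form is likely what is intended.
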
answered Oct 26 '22 01:10 by akrun