Note: The precise problem I hit in this question does not apply to recent versions of data.table. If you want to do something like what is described in the title, check out the corresponding question in the package FAQ: 1.6 "OK, but I don't know the expressions in advance. How do I programatically pass them in?"
I have seen an answer that illustrates how to construct an expression to be evaluated in DT[,j=eval(expr)]. I am using this with an assignment, `:=`(mycol=my_calculation), and I'm wondering...
By "dynamically", I mean "determined after I write the code for my expr".
EDIT: To better illustrate the issue, here is a different example. Look in the edit history to see the original.
require(data.table)
require(plyr)
options(datatable.verbose=TRUE)
DT <- CJ(a=0:1,b=0:1,y=2)
# setup:
expr <- as.quoted(paste(expression(get(col_in_one)+get(col_in_two))))[[1]]
# usage:
col_in_one <- 'a'
col_in_two <- 'b'
col_out <- 'bah'
DT[,(col_out):=eval(expr)] # fails, should take the form j=eval(expr)
I want to keep the setup and usage stages separate, so my code is easier to maintain. My real expression is messier than this example (where it just adds two columns).
First question: How can I make the assigned-to column, "col_out", dynamic? That is, I want to specify both the "col_in_*" columns and "col_out" on the fly.
I have tried creating various expressions in "expr", but as.quoted throws an error about not putting certain stuff to the left of the = symbol.
Second question: How can I avoid the warnings against using get?
The warnings suggest using .SDcols to let [.data.table know which columns I am using. However, if I use the .SDcols argument, another warning says there's no point doing that unless .SD is being used.
The solutions I have so far are...
# Ricardo + eddi:
expr2 <- as.quoted(paste(expression(`:=`(
Vtmp=.SD[[col_in_one]]+.SD[[col_in_two]]))))[[1]]
# usage
col_in_one <- 'a'
col_in_two <- 'b'
col_out <- 'bah'
DT[,eval(expr2),.SDcols=c(col_in_one,col_in_two)]
setnames(DT,'Vtmp',col_out)
This still involves the minor annoyance of doing the operation in two steps and keeping track of "Vtmp", so the first question is still partly open.
Maybe I don't understand the problem well, but does this suffice:
DT[, (col_out) := .SD[[col_in_one]]+.SD[[col_in_two]],
.SDcols = c(col_in_one,col_in_two)]
DT
#   a b y bah
#1: 0 0 2   0
#2: 0 1 2   1
#3: 1 0 2   1
#4: 1 1 2   2
To answer the edited question: to get the eval to work, use .SD as the environment:
DT[, (col_out) := eval(expr, .SD)]
Also, see this question and the update there - eval and quote in data.table
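For reference, here is a minimal end-to-end sketch of the eval(expr, .SD) approach with the setup and usage stages kept separate. It assumes that base R's quote() is an acceptable stand-in for plyr's as.quoted (that swap is my assumption, made only to keep the sketch free of extra dependencies); the column names are the ones from the question.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

# setup: the expression is written before the column names are known.
# quote() is used here instead of plyr::as.quoted (an assumption of this sketch).
expr <- quote(get(col_in_one) + get(col_in_two))

# usage: the input and output columns are chosen afterwards
col_in_one <- "a"
col_in_two <- "b"
col_out    <- "bah"

# .SD appears literally in j, so eval() can use it as the environment
# in which get() looks up the columns
DT[, (col_out) := eval(expr, .SD)]

print(DT)
```

The parentheses around col_out on the left-hand side are what make data.table use the value of the variable as the new column name, rather than creating a column literally called "col_out".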
The simplest way is to set the column name AFTER you evaluate the expression. After all, the time to execute that is constant and nearly 0.
someDummyVar <- "tempColName_XCWF5D"
DT[, (someDummyVar) := eval(expr)]
setnames(DT, someDummyVar, RealColumnName)
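If keeping track of a temporary name is a nuisance, one possible alternative is to build the whole := call from the column names and evaluate it at the top level of j, much as the question's own expr2 attempt does. The sketch below is only an illustration: make_assign_expr and expr3 are hypothetical names introduced here, and the expression is hard-wired to the sum of two columns.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

# setup: hypothetical helper that returns the unevaluated call
#   <col_out> := <col_x> + <col_y>
# with the column names spliced in as symbols by bquote()
make_assign_expr <- function(col_out, col_x, col_y) {
  bquote(.(as.name(col_out)) := .(as.name(col_x)) + .(as.name(col_y)))
}

# usage: the names are supplied only at this point
expr3 <- make_assign_expr("bah", "a", "b")

# j is a bare eval() of a quoted := call, so the assignment
# happens in one step and no rename is needed
DT[, eval(expr3)]

print(DT)
```

Because the columns are referred to by symbol rather than through get or .SD, this form should also sidestep the verbose messages about get, though I have only reasoned that through for a simple case like this one.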
As for question two: Don't turn on verbose warnings and you won't get verbose warnings ;)
options(datatable.verbose=FALSE)
As for Reduce: try posting that as a separate and simplified question so that it is easier to follow what you are doing (outside of the eval issues).