
Column selection with a function in a j-environment

Tags: r, data.table

Consider the following column selection in a data.table:

library(data.table) # using 1.8.7 from r-forge
dt <- data.table(a = 1:5, b = i <- rnorm(5), c = pnorm(i))
dt[, list(a,b)]  #ok

To streamline my code in certain computations with many and variable columns I want to replace list(a,b) with a function. Here is a first try:

.ab <- function()  quote(list(a, b))
dt[, eval(.ab())] #ok - same as above

Ideally, I would like to get rid of eval() in the [.data.table call and confine it to the definition of .ab, while also avoiding passing the data table dt to the function .ab.

.eab <- function()  eval(quote(list(a, b)))
dt[, .eab()] 
# Error in eval(expr, envir, enclos) : object 'b' not found

What's happening? How can this be fixed?

I suspect that what's biting me is R's lexical scoping: the correct evaluation of list(a, b) relies on it happening within the j environment of the data table dt. Alas, I don't know how to fetch a reference to the correct environment and pass it as the envir or enclos argument of eval().

# .eab <- function()  eval(quote(list(a, b)), envir = ?, enclos = ?)
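
The difference between the two attempts can be reproduced in base R alone. The following is a minimal sketch, not data.table's actual implementation: eval_j is a hypothetical stand-in that evaluates an unevaluated j with the columns of a data frame visible, and df is made-up example data.

```r
# Hypothetical stand-in for j evaluation (an assumption, base R only):
# evaluate the unevaluated j with the columns of `data` visible.
eval_j <- function(data, j) eval(substitute(j), data, parent.frame())

df <- data.frame(a = 1:5, b = 6:10)

# Returns the expression only; eval() is then called from inside j, so its
# default envir (the caller's frame) is the environment holding the columns.
.ab <- function() quote(list(a, b))
res_ok <- eval_j(df, eval(.ab()))

# Calls eval() inside the function; its default envir is now the function's
# own frame, whose enclosure is the global environment, where a and b are
# not defined. The lookup therefore fails with an "object not found" error.
.eab <- function() eval(quote(list(a, b)))
res_err <- tryCatch(eval_j(df, .eab()), error = function(e) e)
```

In this sketch res_ok holds the two columns, while res_err holds the captured error, which mirrors the behaviour described above.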

EDIT

This approach almost works:

.eab <- function(e)  eval(quote(list(a, b)), envir = e)
dt[, .eab(dt)]

There are two shortcomings: (1) column names are not returned, and (2) dt has to be passed explicitly (which I'd rather avoid). I would also rather avoid hardcoding dt as the environment of choice. These considerations lead to an alternative way of asking the above question: is there a programmatic way to get the environment dt from within .eab?

Ryogi asked Feb 01 '13 20:02


1 Answer

The intention is to create an expression rather than a function.

DT[, list(a,b), by=...]  # ok

.ab = quote(list(a, b))    # simpler here, no need for function()

DT[, eval(.ab), by=...]  # same

This approach is one reason grouping is fast in data.table: j is evaluated in a static environment for all groups so the (small) overhead of each function call can be avoided.
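
For many and variable columns, such an expression can also be built from a character vector of column names rather than typed out. This is a sketch using base R only; cols and df are made-up example values, and the resulting expr is meant to be used wherever .ab is used above, for instance inside eval() in j.

```r
# Hypothetical column names to select
cols <- c("a", "b")

# Build the call list(a, b) from the names
expr <- as.call(c(list(as.name("list")), lapply(cols, as.name)))

# Named variant, list(a = a, b = b), so the result keeps the column names
args_named <- setNames(lapply(cols, as.name), cols)
expr_named <- as.call(c(list(as.name("list")), args_named))

# Checking the expressions against a plain data frame (example data)
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
res <- eval(expr, df)
res_named <- eval(expr_named, df)
```

Because the expression is constructed once, outside the data table call, no extra function call is introduced into j.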

But if .ab really needs to be a function for some reason, then we can certainly give it further thought.
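
If a function is wanted, one possible direction is to evaluate the expression in the caller's frame via parent.frame(), so nothing has to be passed explicitly. The sketch below shows this in base R only, with a hypothetical stand-in eval_j for j evaluation and made-up data df. Whether it carries over to dt[, .eab()] is an assumption that is not verified here, since data.table may only expose the columns it detects in j.

```r
# Hypothetical stand-in for j evaluation (an assumption, base R only)
eval_j <- function(data, j) eval(substitute(j), data, parent.frame())

df <- data.frame(a = 1:5, b = 6:10)

# Evaluate in the frame the function was called from, not in its own frame.
# Naming the list elements keeps the column names in the result.
.eab <- function() eval(quote(list(a = a, b = b)), parent.frame())

res <- eval_j(df, .eab())
```

Here parent.frame() resolves to the environment in which .eab() is called, which in this stand-in is the one holding the columns.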

Matt Dowle answered Sep 28 '22 03:09