
Pass variable name as argument inside data.table

Tags: r, data.table

I'm trying to write a function that modifies a data.table, and I wanted to use non-standard evaluation so the column can be passed unquoted, but I realised that I don't really know how to make that work inside a data.table. My function is basically something like this:

do_stuff <- function(dt, col) {
  copy(dt)[, new_col := some_fun(col)][]
}

and I want to call it thus:

do_stuff(data, column)

where "column" is the name of a column that exists in "data". If I run that function I get an error:

#> Error in some_fun(col) : object 'column' not found 

This tells me that data.table is apparently passing the correct name ("column") to the function, but for some reason it can't find the column. Here's a minimal reproducible example:

library(data.table)

data <- data.table(x = 1:10, y = rnorm(10))

plus <- function(x, y) {
   x + y
}

add_one <- function(data, col) {
   copy(data)[, z := plus(col, 1)][]
}

add_one(data, y)
#> Error in plus(col, 1): object 'y' not found

Using deparse(substitute(col)) doesn't seem to work, unfortunately :(

add_one <- function(data, col) {
   copy(data)[, z := plus(deparse(substitute(col)), 1)][]
}

add_one(data, y)
#> Error in x + y: non-numeric argument to binary operator
asked Aug 06 '19 15:08 by Elio Campitelli



2 Answers

Another option is to pass the column name as a quoted string and look it up with get:

add_one <- function(data, col) {
  copy(data)[, z := plus(get(col), 1)][]
}

add_one(data, "y")
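
If the unquoted call from the question is preferred, one possible variation is to capture the name with deparse(substitute(col)) in the function body, before the data.table call, and then hand that string to get. This is only a sketch: the helper variable col_name is introduced here for illustration, and it assumes the table has no column that is itself called col_name.

```r
library(data.table)

plus <- function(x, y) {
  x + y
}

add_one <- function(data, col) {
  # Capture the unquoted argument as a string, e.g. "y".
  # This happens outside of `[`, so `col` is still the function's own argument.
  col_name <- deparse(substitute(col))

  # get() looks the string up among the columns when evaluated inside j.
  copy(data)[, z := plus(get(col_name), 1)][]
}

data <- data.table(x = 1:10, y = rnorm(10))
add_one(data, y)
```

The attempt in the question failed because the string produced by deparse was passed straight to plus, which then tried to add 1 to a character value. Wrapping it in get turns the string back into the column.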
answered Nov 02 '22 12:11 by arg0naut91


In general, building the expression with quote and substitute and then running it with eval will work:

library(data.table)
plus <- function(x, y) {
   x + y
}

add_one <- function(data, col) {
   expr0 = quote(copy(data)[, z := plus(col, 1)][])

   expr  = do.call(substitute, list(expr0, list(col = substitute(col))))
   cat("Evaluated expression:\n"); print(expr); cat("\n")

   eval(expr)
}

set.seed(1)
library(magrittr)
data.table(x = 1:10, y = rnorm(10)) %>% 
   add_one(y)

which gives

Evaluated expression:
copy(data)[, `:=`(z, plus(y, 1))][]

     x          y         z
 1:  1 -0.6264538 0.3735462
 2:  2  0.1836433 1.1836433
 3:  3 -0.8356286 0.1643714
 4:  4  1.5952808 2.5952808
 5:  5  0.3295078 1.3295078
 6:  6 -0.8204684 0.1795316
 7:  7  0.4874291 1.4874291
 8:  8  0.7383247 1.7383247
 9:  9  0.5757814 1.5757814
10: 10 -0.3053884 0.6946116
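
The same idea can probably be written a little more compactly with base R's bquote, which splices a value into a quoted expression wherever .() appears. The following is a sketch under that assumption rather than a drop-in replacement; the variable col_sym is a name chosen here for illustration.

```r
library(data.table)

plus <- function(x, y) {
  x + y
}

add_one <- function(data, col) {
  # The unevaluated argument, e.g. the symbol `y`.
  col_sym <- substitute(col)

  # Build the call with the symbol spliced in place of .(col_sym).
  expr <- bquote(copy(data)[, z := plus(.(col_sym), 1)][])

  # Evaluate in the function's environment, where `data` is the argument.
  eval(expr)
}

data <- data.table(x = 1:10, y = rnorm(10))
add_one(data, y)
```

As with the quote/substitute version, data.table only ever sees the finished expression, so the column is looked up by its real name.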
answered Nov 02 '22 12:11 by Frank