How to use non-standard evaluation (NSE) to evaluate arguments on data.table?

Tags:

r

data.table

nse

Say I have the following:

library(data.table)
cars1 = setDT(copy(cars))
cars2 = setDT(copy(cars))

car_list = list(cars1, cars2)
class(car_list) <- "dd"

`[.dd` <- function(x,...) {
  code = rlang::enquos(...)
  cars1 = x[[1]]
  rlang::eval_tidy(quo(cars1[!!!code]))
}

car_list[,.N, by = speed]

I want to perform arbitrary operations on cars1 and cars2 by defining the [.dd method, so that whatever I put into ... gets executed on both cars1 and cars2 using data.table's [ syntax. For example, car_list[,.N, by = speed] should perform the following:

cars1[,.N, by = speed]
cars2[,.N, by = speed]

I also want

car_list[,speed*2]

to do

cars1[,speed*2]
cars2[,speed*2]

Basically, ... in [.dd has to accept arbitrary code.

Somehow I need to capture the ..., so I tried code = rlang::enquos(...) followed by rlang::eval_tidy(quo(cars1[!!!code])), but that doesn't work and gives this error:

Error in `[.data.table`(cars1, ~, ~.N, by = ~speed) : argument "i" is missing, with no default

asked Jul 20 '19 08:07

xiaodai

3 Answers

While it doesn't follow the rlang approach, this seems to work pretty well: lapply(dt_list, '[', ...). An explicit lapply call would be more readable to me, because it is clear about which method is being used. If I saw car_list[, .N, by = speed], I would expect the default data.table method.

Wrapping it in a method gives you the best of both worlds:

class(car_list) <- "dd"

`[.dd` <- function(x,...) {
 lapply(x, '[', ...)
}

car_list[, .N, speed]
car_list[, speed * 2]
car_list[, .(.N, max(dist)), speed]
car_list[, `:=` (more_speed = speed+5)]

Here are some examples of the approach:

car_list[, .N, speed]
# lapply(car_list, '[', j = .N, by = speed)
# or
# lapply(car_list, '[', , .N, speed)
[[1]]
    speed N
 1:     4 2
 2:     7 2
 3:     8 1
 4:     9 1
 5:    10 3
...
[[2]]
    speed N
 1:     4 2
 2:     7 2
 3:     8 1
 4:     9 1
 5:    10 3
...
car_list[, speed * 2]
# lapply(car_list, '[', j = speed*2)
# or
# lapply(car_list, '[', , speed*2)
[[1]]
 [1]  8  8 14 14 16 18 20 20 20 22 22 24 24 24 24 26 26
[18] 26 26 28 28 28 28 30 30 30 32 32 34 34 34 36 36 36
[35] 36 38 38 38 40 40 40 40 40 44 46 48 48 48 48 50

[[2]]
 [1]  8  8 14 14 16 18 20 20 20 22 22 24 24 24 24 26 26
[18] 26 26 28 28 28 28 30 30 30 32 32 34 34 34 36 36 36
[35] 36 38 38 38 40 40 40 40 40 44 46 48 48 48 48 50

car_list[, .(.N, max(dist)), speed]
# lapply(car_list, '[', j = list(.N, max(dist)), by = speed)
# or 
# lapply(car_list, '[', ,.(.N, max(dist)), speed)

[[1]]
    speed N  V2
 1:     4 2  10
 2:     7 2  22
 3:     8 1  16
 4:     9 1  10
 5:    10 3  34
...

[[2]]
    speed N  V2
 1:     4 2  10
 2:     7 2  22
 3:     8 1  16
 4:     9 1  10
 5:    10 3  34
...

This works with the := operator:

car_list[, `:=` (more_speed = speed+5)]
# or
# lapply(car_list, '[', , `:=` (more_speed = speed+5))

car_list
[[1]]
    speed dist more_speed
 1:     4    2          9
 2:     4   10          9
 3:     7    4         12
 4:     7   22         12
 5:     8   16         13
...

[[2]]
    speed dist more_speed
 1:     4    2          9
 2:     4   10          9
 3:     7    4         12
 4:     7   22         12
 5:     8   16         13
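
The reason forwarding ... through lapply works is that the arguments stay unevaluated, so the function that finally receives them can still call substitute() on them, which is what data.table's [ relies on for j. Here is a minimal base R sketch of that mechanism. show_j and forward are hypothetical stand-ins for illustration, not data.table functions:

```r
# Hypothetical stand-in for a function that, like data.table's `[`,
# inspects the unevaluated expression passed as `j`.
show_j <- function(x, i, j) deparse(substitute(j))

# Forward `...` untouched, the same way the `[.dd` method above does.
forward <- function(x, ...) lapply(x, show_j, ...)

# The empty second argument plays the role of the missing `i`.
# `speed` is never evaluated, so it does not need to exist.
captured <- forward(list(1, 2), , speed * 2)
print(captured)
```

Each element of the result is the deparsed text of the expression as typed, which suggests that nothing along the way evaluated it.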

answered Nov 15 '22 10:11

Cole

The first base R option is substitute(...()) followed by do.call:

library(data.table)
cars1 = setDT(copy(cars))
cars2 = setDT(copy(cars))
cars2[, speed := sort(speed, decreasing = TRUE)]

car_list = list(cars1, cars2)
class(car_list) <- "dd"

`[.dd` <- function(x,...) {
  a <- substitute(...()) #this is an alist
  expr <- quote(x[[i]])
  expr <- c(expr, a)
  res <- list()
  for (i in seq_along(x)) {
    res[[i]] <- do.call(data.table:::`[.data.table`, expr)
  }
  res
}

all.equal(
  car_list[,.N, by = speed],
  list(cars1[,.N, by = speed], cars2[,.N, by = speed])
)
#[1] TRUE

all.equal(
  car_list[, speed*2],
  list(cars1[, speed*2], cars2[, speed*2])
)
#[1] TRUE
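
To see what substitute(...()) captures, here is a small base R sketch. capture_dots is a hypothetical helper used only for illustration; the point is that the empty i slot and the named by argument both survive as unevaluated expressions:

```r
# Hypothetical helper: return the unevaluated contents of `...` as a list.
capture_dots <- function(...) as.list(substitute(...()))

args <- capture_dots(, .N, by = speed)

length(args)   # one slot each for the empty `i`, for `j`, and for `by`
names(args)    # only the last argument carries a name
args[[2]]      # the symbol .N, not its value
args$by        # the symbol speed, not a column
```

Prepending quote(x[[i]]) to this list gives the full argument list that do.call then hands to the data.table method.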

The second base R option is match.call: modify the call and then evaluate it (lm uses this approach):

`[.dd` <- function(x,...) {
  thecall <- match.call()
  thecall[[1]] <- quote(`[`)
  thecall[[2]] <- quote(x[[i]])
  res <- list()
  for (i in seq_along(x)) {
    res[[i]] <- eval(thecall)
  }
  res
}

all.equal(
  car_list[,.N, by = speed],
  list(cars1[,.N, by = speed], cars2[,.N, by = speed])
)
#[1] TRUE

all.equal(
  car_list[, speed*2],
  list(cars1[, speed*2], cars2[, speed*2])
)
#[1] TRUE
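
To make the call surgery easier to follow, this base R sketch returns the rewritten call instead of evaluating it. rewrite_call is a hypothetical helper; the two replacements are the same ones used in the method above:

```r
# Hypothetical helper: build the call the method would evaluate,
# but return it unevaluated so it can be inspected.
rewrite_call <- function(x, ...) {
  thecall <- match.call()
  thecall[[1]] <- quote(`[`)      # swap the function being called
  thecall[[2]] <- quote(x[[i]])   # swap the object being subset
  thecall
}

# car_list is never evaluated here, so it does not need to exist.
cl <- rewrite_call(car_list, , .N, by = speed)
print(cl)          # the call that eval() would run for each i
as.list(cl)$by     # the grouping expression is still a bare symbol
```

Because the call refers to x and i by name, evaluating it inside the loop picks up the current element on every iteration.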

I haven't tested whether these approaches make a deep copy if you use :=.

answered Nov 15 '22 10:11

Roland

The suggestion in my comment wasn't complete. You can indeed use rlang to support tidy evaluation, but since data.table itself doesn't support it directly, you're better off using expressions instead of quosures, and you need to build the complete final expression before calling eval_tidy:

`[.dd` <- function(x, ...) {
  code <- rlang::enexprs(...)
  lapply(x, function(dt) {
    ex <- rlang::expr(dt[!!!code])
    rlang::eval_tidy(ex)
  })
}

car_list[, .N, by = speed]
[[1]]
    speed N
 1:     4 2
 2:     7 2
 3:     8 1
 4:     9 1
 5:    10 3
 6:    11 2
 7:    12 4
 8:    13 4
 9:    14 4
10:    15 3
11:    16 2
12:    17 3
13:    18 4
14:    19 3
15:    20 5
16:    22 1
17:    23 1
18:    24 4
19:    25 1

[[2]]
    speed N
 1:     4 2
 2:     7 2
 3:     8 1
 4:     9 1
 5:    10 3
 6:    11 2
 7:    12 4
 8:    13 4
 9:    14 4
10:    15 3
11:    16 2
12:    17 3
13:    18 4
14:    19 3
15:    20 5
16:    22 1
17:    23 1
18:    24 4
19:    25 1
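
The error in the question comes from splicing quosures into the call: data.table receives formula-like objects (~.N, by = ~speed) rather than plain expressions. The sketch below compares the two capture functions. capture_both is a hypothetical helper, and nothing here touches a data.table:

```r
# Hypothetical helper: capture the same `...` both ways.
capture_both <- function(...) {
  list(
    quos  = rlang::enquos(...),   # expressions bundled with their environment
    exprs = rlang::enexprs(...)   # bare expressions only
  )
}

captured <- capture_both(, .N, by = speed)

class(captured$quos[[2]])    # a quosure object, which data.table does not unwrap
captured$exprs[[2]]          # the plain symbol .N
captured$exprs$by            # the plain symbol speed
```

Splicing the bare expressions with !!! produces a call that looks like one typed by hand, which is why the enexprs version works.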

answered Nov 15 '22 08:11

Alexis
