Say I have the following
library(data.table)
cars1 = setDT(copy(cars))
cars2 = setDT(copy(cars))
car_list = list(cars1, cars2)
class(car_list) <- "dd"
`[.dd` <- function(x,...) {
  code = rlang::enquos(...)
  cars1 = x[[1]]
  rlang::eval_tidy(quo(cars1[!!!code]))
}
car_list[,.N, by = speed]
I want to perform arbitrary operations on cars1 and cars2 by defining the [.dd function so that whatever I put into ... gets executed on cars1 and cars2 using the data.table [ syntax. For example,
car_list[,.N, by = speed]
should perform the following
cars1[,.N, by = speed]
cars2[,.N, by = speed]
I also want
car_list[,speed*2]
to do
cars1[,speed*2]
cars2[,speed*2]
Basically, ... in [.dd has to accept arbitrary code. Somehow I need to capture the ..., so I tried code = rlang::enquos(...) followed by rlang::eval_tidy(quo(cars1[!!!code])), but that doesn't work and gives this error:
Error in `[.data.table`(cars1, ~, ~.N, by = ~speed) : argument "i" is missing, with no default
While it doesn't follow the rlang approach, this seems to work pretty well: lapply(dt_list, '[', ...)
That code would be more readable to me, as it is explicit about which method is being used. If I saw car_list[, .N, by = speed], I would expect the default data.table methods.
Wrapping it in a [ method for the class gives you the best of both worlds:
class(car_list) <- "dd"
`[.dd` <- function(x,...) {
  lapply(x, '[', ...)
}
car_list[, .N, speed]
car_list[, speed * 2]
car_list[, .(.N, max(dist)), speed]
car_list[, `:=` (more_speed = speed+5)]
Here are some examples of the approach:
car_list[, .N, speed]
# lapply(car_list, '[', j = .N, by = speed)
# or
# lapply(car_list, '[', , .N, speed)
[[1]]
speed N
1: 4 2
2: 7 2
3: 8 1
4: 9 1
5: 10 3
...
[[2]]
speed N
1: 4 2
2: 7 2
3: 8 1
4: 9 1
5: 10 3
...
car_list[, speed * 2]
# lapply(car_list, '[', j = speed*2)
# or
# lapply(car_list, '[', , speed*2)
[[1]]
[1] 8 8 14 14 16 18 20 20 20 22 22 24 24 24 24 26 26
[18] 26 26 28 28 28 28 30 30 30 32 32 34 34 34 36 36 36
[35] 36 38 38 38 40 40 40 40 40 44 46 48 48 48 48 50
[[2]]
[1] 8 8 14 14 16 18 20 20 20 22 22 24 24 24 24 26 26
[18] 26 26 28 28 28 28 30 30 30 32 32 34 34 34 36 36 36
[35] 36 38 38 38 40 40 40 40 40 44 46 48 48 48 48 50
car_list[, .(.N, max(dist)), speed]
# lapply(car_list, '[', j = list(.N, max(dist)), by = speed)
# or
# lapply(car_list, '[', ,.(.N, max(dist)), speed)
[[1]]
speed N V2
1: 4 2 10
2: 7 2 22
3: 8 1 16
4: 9 1 10
5: 10 3 34
...
[[2]]
speed N V2
1: 4 2 10
2: 7 2 22
3: 8 1 16
4: 9 1 10
5: 10 3 34
...
This also works with the := operator:
car_list[, `:=` (more_speed = speed+5)]
# or
# lapply(car_list, '[', , `:=` (more_speed = speed+5))
car_list
[[1]]
speed dist more_speed
1: 4 2 9
2: 4 10 9
3: 7 4 12
4: 7 22 12
5: 8 16 13
...
[[2]]
speed dist more_speed
1: 4 2 9
2: 4 10 9
3: 7 4 12
4: 7 22 12
5: 8 16 13
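To check that the wrapper really matches calling [ on each table separately, here is a small self-contained sketch. It assumes the one-line lapply method shown above and compares its results with the direct calls using all.equal; only base R and data.table are used.

```r
library(data.table)

cars1 <- setDT(copy(cars))
cars2 <- setDT(copy(cars))
car_list <- list(cars1, cars2)
class(car_list) <- "dd"

# Forward every argument, untouched, to `[` for each element of the list
`[.dd` <- function(x, ...) {
  lapply(x, "[", ...)
}

# Compare the wrapper against calling `[` on each table directly
same_counts <- all.equal(
  car_list[, .N, by = speed],
  list(cars1[, .N, by = speed], cars2[, .N, by = speed])
)
same_vector <- all.equal(
  car_list[, speed * 2],
  list(cars1[, speed * 2], cars2[, speed * 2])
)
print(same_counts)
print(same_vector)
```

If either comparison failed, all.equal would return a description of the difference instead of TRUE.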
The first base R option is substitute(...()) followed by do.call:
library(data.table)
cars1 = setDT(copy(cars))
cars2 = setDT(copy(cars))
cars2[, speed := sort(speed, decreasing = TRUE)]
car_list = list(cars1, cars2)
class(car_list) <- "dd"
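`[.dd` <- function(x,...) {
  a <- substitute(...()) # unevaluated arguments from ..., like an alist
  expr <- quote(x[[i]])  # placeholder for the i-th table
  expr <- c(expr, a)     # argument list: the table first, then the captured arguments
  res <- list()
  for (i in seq_along(x)) {
    # x[[i]] is evaluated here, inside this function, where x and i exist
    res[[i]] <- do.call(data.table:::`[.data.table`, expr)
  }
  res
}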
all.equal(
car_list[,.N, by = speed],
list(cars1[,.N, by = speed], cars2[,.N, by = speed])
)
#[1] TRUE
all.equal(
car_list[, speed*2],
list(cars1[, speed*2], cars2[, speed*2])
)
#[1] TRUE
The second base R option is match.call: modify the call and then evaluate it (you find this approach in lm):
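`[.dd` <- function(x,...) {
  thecall <- match.call()
  thecall[[1]] <- quote(`[`)    # call the generic `[` instead of `[.dd`
  thecall[[2]] <- quote(x[[i]]) # replace the list with its i-th table
  res <- list()
  for (i in seq_along(x)) {
    res[[i]] <- eval(thecall)   # evaluated here, where x and i exist
  }
  res
}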
all.equal(
car_list[,.N, by = speed],
list(cars1[,.N, by = speed], cars2[,.N, by = speed])
)
#[1] TRUE
all.equal(
car_list[, speed*2],
list(cars1[, speed*2], cars2[, speed*2])
)
#[1] TRUE
I haven't tested whether these approaches make a deep copy if you use :=.
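One way to check this yourself is data.table's address() function, which reports where an object lives in memory. The sketch below reuses the match.call version from above and compares the address of the first table before and after a := call, then looks for the new column through the original list. The column name more_speed is only an illustration, and I am not stating the outcome here; run it and read the two printed values.

```r
library(data.table)

cars1 <- setDT(copy(cars))
cars2 <- setDT(copy(cars))
car_list <- list(cars1, cars2)
class(car_list) <- "dd"

# The match.call version of the method shown above
`[.dd` <- function(x, ...) {
  thecall <- match.call()
  thecall[[1]] <- quote(`[`)
  thecall[[2]] <- quote(x[[i]])
  res <- list()
  for (i in seq_along(x)) {
    res[[i]] <- eval(thecall)
  }
  res
}

addr_before <- address(car_list[[1]])
res <- car_list[, more_speed := speed + 5]  # more_speed: illustrative name
addr_after <- address(car_list[[1]])

# TRUE means the first table sits at the same address after the := call
same_address <- identical(addr_before, addr_after)
print(same_address)

# Shows whether the new column is visible through the original list
print("more_speed" %in% names(car_list[[1]]))
```

If the address is unchanged and the column is visible through car_list, the update happened by reference. If the column only shows up in res, a copy was modified instead.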
The suggestion in my comment wasn't complete. You can indeed use rlang to support tidy evaluation, but since data.table itself doesn't support it directly, you're better off using expressions instead of quosures, and you need to build the complete final expression before calling eval_tidy:
`[.dd` <- function(x, ...) {
  code <- rlang::enexprs(...)
  lapply(x, function(dt) {
    ex <- rlang::expr(dt[!!!code])
    rlang::eval_tidy(ex)
  })
}
car_list[, .N, by = speed]
[[1]]
speed N
1: 4 2
2: 7 2
3: 8 1
4: 9 1
5: 10 3
6: 11 2
7: 12 4
8: 13 4
9: 14 4
10: 15 3
11: 16 2
12: 17 3
13: 18 4
14: 19 3
15: 20 5
16: 22 1
17: 23 1
18: 24 4
19: 25 1
[[2]]
speed N
1: 4 2
2: 7 2
3: 8 1
4: 9 1
5: 10 3
6: 11 2
7: 12 4
8: 13 4
9: 14 4
10: 15 3
11: 16 2
12: 17 3
13: 18 4
14: 19 3
15: 20 5
16: 22 1
17: 23 1
18: 24 4
19: 25 1
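To see why expressions work here while quosures do not, it can help to look at what each capture function returns. This is a rough sketch using two hypothetical helpers (capture_quos and capture_exprs, which are not part of rlang) that do nothing but capture their arguments. The quosures are what the error message in the question prints as ~.N and ~speed.

```r
library(rlang)

# Hypothetical helpers, for illustration only
capture_quos  <- function(...) enquos(...)
capture_exprs <- function(...) enexprs(...)

q <- capture_quos(, .N, by = speed)
e <- capture_exprs(, .N, by = speed)

# enquos() wraps each argument in a quosure (expression plus environment)
print(is_quosure(q[[2]]))

# enexprs() keeps the bare expression, here the symbol .N
print(is_symbol(e[[2]]))

# Splicing the bare expressions rebuilds an ordinary `[` call on dt;
# dt is just a symbol here, nothing is evaluated
ex <- expr(dt[!!!e])
print(ex)
```

Because data.table inspects the unevaluated j and by arguments itself, it needs to receive plain symbols and calls, not quosure objects.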