The new curly curly method of tidy evaluation is explained in this article. Several examples are given demonstrating the use of this style of non-standard evaluation (NSE).
library(tidyverse)
# Example 1 --------------------------
max_by <- function(data, var, by) {
data %>%
group_by({{ by }}) %>%
summarise(maximum = max({{ var }}, na.rm = TRUE))
}
starwars %>% max_by(height)
starwars %>% max_by(height, by = gender)
# Example 2 --------------------------
summarise_by <- function(data, ..., by) {
data %>%
group_by({{ by }}) %>%
summarise(...)
}
starwars %>%
summarise_by(average = mean(height, na.rm = TRUE),
maximum = max(height, na.rm = TRUE),
by = gender)
I created some of my own functions and this is indeed a lot easier framework to develop in, instead of worrying about all the quosures and bangs and all of that.
However, this same article explains that we're not completely out of the woods yet:
You only need quote-and-unquote (with the plural variants enquos() and !!!) when you need to modify the inputs or their names in some way.
... and no example is provided. Not complaining, just asking if somebody can fill in the gap and provide an example. Not being fluent in Tidy evaluation, I really don't understand what the author is getting at with that quote (pardon the pun).
Tidy evaluation is a framework for controlling how expressions and variables in your code are evaluated by tidyverse functions. This framework, housed in the rlang package, is a powerful tool for writing more efficient and elegant code.
rlang is a toolkit for working with core R and Tidyverse features, and hosts the tidy evaluation framework. The full set of changes can be found in the changelog.
Say you want a version of the following function that takes multiple inputs instead of just a single var
:
mean_by <- function(data, var, by) {
data %>%
group_by({{ by }}) %>%
summarise(average = mean({{ var }}, na.rm = TRUE))
}
You can't just pass ...
to summarise, because then the user needs to call mean()
themselves.
mean_by <- function(data, var, ..., by) {
data %>%
group_by({{ by }}) %>%
summarise(...)
}
mtcars %>% mean_by(foo = disp)
#> Error: Column `foo` must be length 1 (a summary value), not 32
mtcars %>% mean_by(foo = mean(disp))
#> # A tibble: 1 x 1
#> foo
#> <dbl>
#> 1 231.
The solution is to quote the dots, modify each of the inputs so they are wrapped in a new call to mean()
, and then splice them back:
mean_by <- function(data, ..., by) {
# `.named` makes sure the dots have default names, if not supplied
dots <- enquos(..., .named = TRUE)
# Go over all inputs, and wrap them in a call
dots <- lapply(dots, function(dot) call("mean", dot, na.rm = TRUE))
# Finally, splice the expressions back into `summarise()`:
data %>%
group_by({{ by }}) %>%
summarise(!!!dots)
}
We are considering how we could improve syntax for this case. Early thoughts at http://rpubs.com/lionel-/superstache
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With