
Curly-curly tidy evaluation and modifying inputs or their names

The new curly curly method of tidy evaluation is explained in this article. Several examples are given demonstrating the use of this style of non-standard evaluation (NSE).

library(tidyverse)

# Example 1 --------------------------
max_by <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(maximum = max({{ var }}, na.rm = TRUE))
}
starwars %>% max_by(height)
starwars %>% max_by(height, by = gender)

# Example 2 --------------------------
summarise_by <- function(data, ..., by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(...)
}

starwars %>%
  summarise_by(average = mean(height, na.rm = TRUE),
               maximum = max(height, na.rm = TRUE),
               by = gender)

I created some functions of my own, and this is indeed a much easier framework to develop in than worrying about quosures, bangs and all of that.

However, this same article explains that we're not completely out of the woods yet:

You only need quote-and-unquote (with the plural variants enquos() and !!!) when you need to modify the inputs or their names in some way.

... and no example is provided. Not complaining, just asking if somebody can fill in the gap and provide an example. Not being fluent in Tidy evaluation, I really don't understand what the author is getting at with that quote (pardon the pun).

asked Jul 08 '19 13:07


1 Answer

Say you want a version of the following function that takes multiple inputs instead of just a single var:

mean_by <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(average = mean({{ var }}, na.rm = TRUE))
}

You can't just pass ... to summarise(), because then the user would have to call mean() themselves.

mean_by <- function(data, ..., by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(...)
}

mtcars %>% mean_by(foo = disp)
#> Error: Column `foo` must be length 1 (a summary value), not 32

mtcars %>% mean_by(foo = mean(disp))
#> # A tibble: 1 x 1
#>     foo
#>   <dbl>
#> 1  231.

The solution is to quote the dots, modify each of the inputs so they are wrapped in a new call to mean(), and then splice them back:

mean_by <- function(data, ..., by) {
  # `.named` makes sure the dots have default names, if not supplied
  dots <- enquos(..., .named = TRUE)

  # Go over all inputs, and wrap them in a call
  dots <- lapply(dots, function(dot) call("mean", dot, na.rm = TRUE))

  # Finally, splice the expressions back into `summarise()`:
  data %>%
    group_by({{ by }}) %>%
    summarise(!!!dots)
}
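
The quoted sentence also mentions modifying the names of the inputs, which the function above leaves at their defaults. The same quote-modify-splice pattern covers that case. What follows is only a minimal sketch: the function name `mean_by_prefixed` and the `"mean_"` prefix are hypothetical choices made for illustration, not part of the article or of any package.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of mean_by(): the name `mean_by_prefixed` and the
# "mean_" prefix are illustrative assumptions.
mean_by_prefixed <- function(data, ..., by) {
  # Quote the dots; `.named = TRUE` gives unnamed inputs default names
  dots <- enquos(..., .named = TRUE)

  # Modify the inputs: wrap each one in a call to mean()
  dots <- lapply(dots, function(dot) call("mean", dot, na.rm = TRUE))

  # Modify the names: prefix each output column with "mean_"
  names(dots) <- paste0("mean_", names(dots))

  # Splice the rewrapped, renamed expressions back into summarise()
  data %>%
    group_by({{ by }}) %>%
    summarise(!!!dots)
}

result <- mtcars %>% mean_by_prefixed(disp, hp, by = cyl)
names(result)
```

With `mtcars` grouped by `cyl`, the result should have one row per cylinder count and the columns `cyl`, `mean_disp` and `mean_hp`. Neither step is possible with `{{ }}` alone, because both the expressions and their names are changed between quoting and splicing.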

We are considering how we could improve syntax for this case. Early thoughts at http://rpubs.com/lionel-/superstache

Lionel Henry answered Nov 23 '22 09:11