Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use dplyr::case_when with arguments programmatically

Tags:

r

dplyr

rlang

I'd like to be able to use dplyr's case_when in a programmatic way to replace base R cut()function.

Currently, case_when can be used with an external argument through NSE like :

library(dplyr)
library(rlang)

patterns <- list(
  x <= 2 ~ "<=2",
  x <= 4 ~ "2<->4",
  x > 4 ~ ">4"
 )
 x <- 1:10
 case_when(!!!patterns)

What I want to do is : use it with another variable, inside a mutate

The idea would be something like this, although I can't figure out how to get it to work :

library(dplyr)
patterns_lazy <- list(
  !!quo(x) <= 2 ~ "<=2",
  !!quo(x) <= 4 ~ "2<->4",
  !!quo(x) > 4 ~ ">4"
)
x <- "cyl"
mtcars %>% mutate(ABC = case_when(!!!patterns_lazy))

I'd like to be able to define the column (inside a string) that I want to filter on, and retrieve something like this (this example is not working as it's the desired syntax):

x <- "cyl"
mtcars %>%
  select(cyl) %>%
  mutate(ABC = case_when(!!!patterns_lazy)) %>%
  head()

  cyl ABC
1   6 >4
2   6 >4
3   4 2<->4
4   6 >4
5   8 >4
6   6 >4

Thanks for any help :)

like image 892
RobinCura Avatar asked Jun 29 '17 10:06

RobinCura


1 Answers

You cannot use !! there:

patterns <- list(
  !!quo(x) <= 2 ~ "<=2",
  !!quo(x) <= 4 ~ "2<->4",
  !!quo(x) > 4 ~ ">4"
)
  1. Neither list() nor ~ support quasiquotation.
  2. If it did support quasiquotation, you'd need to be careful with operator precedence and wrap your !!quo() in parentheses.
  3. And finally, that quote to x would evaluate to a string and you'd be comparing numbers with strings (in your example it would be "cyl) which R will do happily thanks to implicit coercions :/

So you need to use exprs() instead of list(), and use x with the .data pronoun instead of a quote of x.

exprs() will create a list of unevaluated expressions. Unevaluated is good: if your formula was evaluated, it would carry an environment (here the global env) and that environment doesn't contain any of the data supplied to dplyr, and in particular doesn't have the .data pronoun. On the other hand if the formulas are "context-less", they get evaluated in the data context which is what we want.

patterns_lazy <- exprs(
  .data[[x]] <= 2 ~ "<=2",
  .data[[x]] <= 4 ~ "2<->4",
  .data[[x]] > 4 ~ ">4"
)

x <- "cyl"
pull(mutate(mtcars, case_when(!!!patterns_lazy)))
#>  [1] ">4"    ">4"    "2<->4" ">4"    ">4"    ">4"    ">4"    "2<->4" "2<->4"
#> [10] ">4"    ">4"    ">4"    ">4"    ">4"    ">4"    ">4"    ">4"    "2<->4"
#> [19] "2<->4" "2<->4" "2<->4" ">4"    ">4"    ">4"    ">4"    "2<->4" "2<->4"
#> [28] "2<->4" ">4"    ">4"    ">4"    "2<->4"
like image 78
Lionel Henry Avatar answered Oct 11 '22 05:10

Lionel Henry