library(tidyverse)
input_name <- "birth_year"
input_value <- 19
quo(filter(starwars, !!input_name == !!input_value)) # line 5
quo(filter(starwars, !!sym(input_name) == !!input_value)) # line 6
What's the difference between line #5 and line #6, and the use of the sym() function? Why is sym() only required on the left side of the equation in line #6?
Is the point of sym() to take character strings and unquote them into symbols?
<quosure>
expr: ^filter(starwars, "birth_year" == 19)
env: global
<quosure>
expr: ^filter(starwars, birth_year == 19)
env: global
The answer is yes: sym() converts a character string into a symbol, and !! then unquotes that symbol into the surrounding expression. The reason you need this on the left-hand side of the equality can be seen in ?filter:
...: Logical predicates defined in terms of the variables in ‘.data’. Multiple conditions are combined with ‘&’. Only rows where the condition evaluates to ‘TRUE’ are kept.
filter( starwars, "birth_year" == 19 ) will always return no rows, because the string literal "birth_year" is never equal to the numeric literal 19 (which is implicitly coerced to the string "19" for the comparison). By using !!sym(), you convert that string into a symbol and unquote it, so filter() looks at the column called birth_year in the starwars data frame rather than at the literal string "birth_year".
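The coercion and the string-versus-symbol distinction described above can be checked without any data frame. This is a minimal sketch that assumes only the rlang package is installed; the variable name col is chosen for illustration.

```r
library(rlang)

# A string compared with a number: the number is coerced to character first.
string_vs_number <- "birth_year" == 19   # compares "birth_year" with "19"
coerced_match    <- "19" == 19           # 19 becomes "19", so the two match

# sym() returns a symbol (a name), not a character string.
col <- sym("birth_year")
col_is_symbol <- is.symbol(col)
col_is_string <- is.character(col)

# Capture both expressions without evaluating them to see what filter() receives.
with_string <- expr("birth_year" == 19)
with_symbol <- expr(!!col == 19)

print(with_string)
print(with_symbol)
```

Printing the two captured expressions shows the same difference as the two quosures in the question: one compares a string literal, the other refers to a name that filter() can resolve to a column.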
Conversely, you don't need sym() on the right-hand side of the comparison, because there is no column named 19 in starwars, and you're interested in the literal value 19 itself. If you were comparing two columns in the data frame, then you would want sym() on both sides of the equality. For example,
name1 <- "skin_color"
name2 <- "eye_color"
filter( starwars, !!sym(name1) == !!sym(name2) )
# # A tibble: 6 x 13
# name height mass hair_color skin_color eye_color birth_year gender homeworld
# <chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr>
# 1 Wick… 88 20 brown brown brown 8 male Endor
# 2 Jar … 196 66 none orange orange 52 male Naboo
# 3 Eeth… 171 NA black brown brown NA male Iridonia
# 4 Mas … 196 NA none blue blue NA male Champala
# ...
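Returning to the single-column case from the question, the same pattern can be wrapped in a small function. This is only a sketch: filter_by_name is a hypothetical helper name, not part of dplyr or rlang, and it assumes both packages are installed.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: keep rows where the column named `col_name` equals `value`.
filter_by_name <- function(data, col_name, value) {
  col <- sym(col_name)              # string -> symbol
  filter(data, !!col == !!value)    # unquote the symbol and the literal value
}

# Column lookup: rows of starwars whose birth_year is 19.
res <- filter_by_name(starwars, "birth_year", 19)

# For contrast, comparing against the bare string matches nothing.
no_rows <- filter(starwars, "birth_year" == 19)
```

Unquoting value with !! inserts the literal into the expression, so it cannot be confused with a column that happens to share the argument's name.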
In the first case, the column is not evaluated; it is the string itself that gets evaluated, so the comparison is made against the literal "birth_year". Converting the string to a symbol and unquoting it makes filter() evaluate the symbol and return the column values. sym() is required on the lhs because we are not trying to use the literal string there, but to extract the column's values.
According to ?sym
sym() creates a symbol from a string and syms() creates a list of symbols from a character vector.
and according to ?"!!"
The !! operator unquotes its argument. It gets evaluated immediately in the surrounding context.
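To see the two quoted help entries working together, here is a minimal sketch on a toy data frame. The data frame df and its columns a and b are made up for illustration, and eval_tidy() from rlang stands in for what filter() does internally when it evaluates a predicate against the data.

```r
library(rlang)

# Hypothetical toy data standing in for starwars.
df <- data.frame(a = c(1, 2, 3), b = c(1, 5, 3))

# sym() creates one symbol; syms() creates a list of symbols.
one  <- sym("a")
many <- syms(c("a", "b"))

# !! unquotes each symbol into the expression being captured.
lhs <- many[[1]]
rhs <- many[[2]]
cmp <- expr(!!lhs == !!rhs)          # the captured expression is a == b

# Evaluate the expression with the data frame's columns in scope.
keep <- eval_tidy(cmp, data = df)
matching_rows <- df[keep, ]
```

Because both sides are symbols, the comparison runs column against column, row by row, which is the same situation as the skin_color and eye_color example.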