I use a dynamic variable (e.g. ID) to reference a column name that changes depending on which gene I am processing at the time. I then use case_when within mutate to create a new column whose values depend on the dynamic column.
I thought that !! (bang-bang) was what I needed to force evaluation of the content of the variable; however, I did not get the expected output in my new column. Only !!as.name gave me the output I was expecting, and I do not fully understand why. Could someone explain why using only !! isn't appropriate in this case, and what is happening in !!as.name?
Here is a simple reproducible example that I made up to demo what I am experiencing:
library(tidyverse)
ID <- "birth_year"
# Correct output
test <- starwars %>%
  mutate(FootballLeague = case_when(
    !!as.name(ID) < 10 ~ "U10",
    !!as.name(ID) >= 10 & !!as.name(ID) < 50 ~ "U50",
    !!as.name(ID) >= 50 & !!as.name(ID) < 100 ~ "U100",
    !!as.name(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))
# Incorrect output
test2 <- starwars %>%
  mutate(FootballLeague = case_when(
    !!(ID) < 10 ~ "U10",
    !!(ID) >= 10 & !!(ID) < 50 ~ "U50",
    !!(ID) >= 50 & !!(ID) < 100 ~ "U100",
    !!(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))
# Incorrect output
test3 <- starwars %>%
  mutate(FootballLeague = case_when(
    as.name(ID) < 10 ~ "U10",
    as.name(ID) >= 10 & as.name(ID) < 50 ~ "U50",
    as.name(ID) >= 50 & as.name(ID) < 100 ~ "U100",
    as.name(ID) >= 100 ~ "Senior",
    TRUE ~ "Others"
  ))
identical(test, test2)
# FALSE
identical(test2, test3)
# TRUE
sessionInfo()
#R version 4.0.2 (2020-06-22)
#Platform: x86_64-centos7-linux-gnu (64-bit)
#Running under: CentOS Linux 7 (Core)
# tidyverse_1.3.0
# dplyr_1.0.2
Cheers!
You can wrap your expressions in the function quo() to see the result of the operation after applying the !! operator. For simplicity, I will use a shorter expression for demonstration:
Preparations:
library(tidyverse)
ID <- "birth_year"
## Test without quasiquotation:
starwars %>%
  filter(birth_year < 50)
Experiment 1:
quo(
  starwars %>%
    filter(ID < 50)
)
## result: starwars %>% filter(ID < 50)
We learn: filter() captures the expression ID < 50 "as is" instead of first replacing ID with its value. So we need a mechanism to tell filter() that it should treat ID as a variable and use its value.
--> The !! operator can be used to tell filter() that it should treat an expression as a variable and substitute its value.
Experiment 2:
quo(
  starwars %>%
    filter(!!ID < 50)
)
## result: starwars %>% filter("birth_year" < 50)
We learn: The !! operator has indeed worked: ID was replaced with its value. But the value of ID is the string "birth_year". Note the quotes in the result. As you probably know, tidyverse functions don't take column names as strings here; they want the bare names, without quotes. Compare with Experiment 1: filter() takes everything "as is", so it compares the literal string "birth_year" (including the quotes!) with 50 instead of looking up the column birth_year.
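What happens to the substituted string can be checked in plain base R, without dplyr. The sketch below assumes a small made-up data frame (people is hypothetical, not the starwars data) and shows that comparing a string with a number yields a single logical value rather than one value per row:

```r
# Hypothetical toy data, standing in for starwars
people <- data.frame(name = c("Luke", "Yoda", "Leia"),
                     birth_year = c(19, 896, 19))

# After !!ID, the condition is a string compared with a number.
# R coerces 50 to "50" and compares the two strings.
from_string <- "birth_year" < 50

# Referring to the column gives one comparison per row.
from_column <- people$birth_year < 50

length(from_string)  # 1
length(from_column)  # 3
```

That single value is then reused for every row, which would explain why all rows end up in the same case_when() branch in the question.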
What does the function as.name() do?
This is a base R function that takes a string (or a variable containing a string) and returns the content of the string as a variable name (a symbol).
So if you call as.name(ID) in base R, the result is birth_year, this time without quotes, just like the tidyverse expects it. So let's try it:
Experiment 3:
quo(
  starwars %>%
    filter(as.name(ID) < 50)
)
## result: starwars %>% filter(as.name(ID) < 50)
We learn: This did not work because, again, filter() takes everything "as is". The expression still contains the call as.name(ID) rather than the column name birth_year, so the column is never looked up.
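To see what as.name() returns on its own, here is a minimal base R check. It only uses base functions; the data frame people is a made-up stand-in for starwars:

```r
ID <- "birth_year"
people <- data.frame(birth_year = c(19, 896, 19))  # hypothetical toy data

col_sym <- as.name(ID)
is.symbol(col_sym)     # TRUE: a symbol, not a character string
is.character(col_sym)  # FALSE

# Evaluating the symbol inside the data frame looks up the column.
eval(col_sym, people)  # 19 896 19

# Compared directly, a symbol is deparsed to a string first (see ?Comparison),
# so this is the same comparison as "birth_year" < 50.
identical(col_sym < 50, "birth_year" < 50)  # TRUE
```

The last line would explain why test2 and test3 in the question come out identical.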
--> We need to combine the two things to make it work:
as.name() to convert the string to a variable name.
!! to tell filter() it should not take things "as is", but substitute the real value.
Experiment 4:
quo(
  starwars %>%
    filter(!!as.name(ID) < 50)
)
## result: starwars %>% filter(birth_year < 50)
Now it works! :)
I have used filter() in my experiments, but it works exactly the same with mutate() and other tidyverse functions.
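The same substitution idea can be reproduced in base R with bquote(), where .() plays a role similar to !!. This is only an analogy to illustrate the mechanism, not how dplyr is implemented; the data frame people is made up for the example:

```r
ID <- "birth_year"
people <- data.frame(name = c("Luke", "Yoda", "Leia"),
                     birth_year = c(19, 896, 19))  # hypothetical toy data

# Substituting the string, like !!ID
expr_string <- bquote(.(ID) < 50)
deparse(expr_string)  # "\"birth_year\" < 50"

# Substituting the symbol, like !!as.name(ID)
expr_symbol <- bquote(.(as.name(ID)) < 50)
deparse(expr_symbol)  # "birth_year < 50"

# Only the symbol version refers to the column when evaluated in the data.
keep <- eval(expr_symbol, people)
people[keep, ]  # rows for Luke and Leia
```

In both cases the value is spliced into the expression before it is evaluated; what matters is whether that value is a string or a symbol.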