Use variable names in functions of dplyr

Tags: r, r-faq, dplyr, rlang

I want to use variable names as strings in functions of dplyr. See the example below:

    library(dplyr)

    df <- data.frame(
      color = c("blue", "black", "blue", "blue", "black"),
      value = 1:5
    )
    filter(df, color == "blue")

It works perfectly, but I would like to refer to color by string, something like this:

    var <- "color"
    filter(df, this_probably_should_be_a_function(var) == "blue")

I would be happy to do this by any means, and super-happy if I could keep the easy-to-read dplyr syntax.

kuba asked Jul 04 '14 07:07



1 Answer

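In newer versions of dplyr (0.7.0 onward, which introduced tidy evaluation), we can capture the variable as a quosure with quo() and then unquote it for evaluation with !! (or with the older UQ() spelling, which rlang has since deprecated):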

    var <- quo(color)
    filter(df, UQ(var) == "blue")
    #   color value
    # 1  blue     1
    # 2  blue     3
    # 3  blue     4

Due to operator precedence, we may need to wrap !!var in parentheses:

    filter(df, (!!var) == "blue")
    #   color value
    # 1  blue     1
    # 2  blue     3
    # 3  blue     4

With newer versions of rlang (0.2.0 and later), !! has higher precedence than comparison operators, so

    filter(df, !!var == "blue")

should work without the parentheses (as @Moody_Mudskipper commented).
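The question starts from a string ("color") rather than a bare column name, which quo() does not cover directly. A minimal sketch of two ways that are commonly used for this, assuming a reasonably recent dplyr and rlang are installed: the .data pronoun, and converting the string to a symbol with rlang::sym() before unquoting. The result names res_data and res_sym are only illustrative.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)
var <- "color"  # column name held as a string

# Option 1: the .data pronoun looks the column up by its string name
res_data <- filter(df, .data[[var]] == "blue")

# Option 2: turn the string into a symbol, then unquote it with !!
# (parentheses keep this independent of the !! precedence change)
res_sym <- filter(df, (!!rlang::sym(var)) == "blue")

print(res_data)
```

Both calls should keep only the rows where color is "blue". The .data form tends to read more clearly inside functions, since it makes explicit that the name refers to a column of the data frame rather than to a variable in the calling environment.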

Older option

We may also use get(), with var holding the column name as a string rather than a quosure:

    var <- "color"
    filter(df, get(var, envir = as.environment(df)) == "blue")
    #   color value
    # 1  blue     1
    # 2  blue     3
    # 3  blue     4
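For comparison, the same string-based lookup can be done without dplyr at all. This is a sketch of a plain base R alternative, not part of the answer's approach; res_base is an illustrative name.

```r
df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)
var <- "color"  # column name held as a string

# df[[var]] extracts the column named by the string;
# the resulting logical vector selects the matching rows
res_base <- df[df[[var]] == "blue", , drop = FALSE]

print(res_base)
```

Unlike filter(), base subsetting keeps the original row names (1, 3, 4 here) and returns NA rows when the comparison yields NA, so the two are not exact drop-in replacements for each other.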

EDIT: Rearranged the order of solutions

akrun answered Oct 27 '22 00:10