Why doesn't dplyr filter() work within a function (i.e., using a variable for the column name)?

Tags: r, filter, dplyr

I have a function for filtering, grouping and mutating data with dplyr functions. The basic pipe sequence works great outside a function, where I use the actual column names. When I put it in a function where the column name is held in a variable, some of the functions work but some don't, most notably dplyr::filter(). For example:

var1 <- c('yes', NA, NA, 'yes', 'yes', NA, NA, NA, 'yes', NA, 'no', 'no', 'no', 'maybe', NA, 'maybe', 'maybe', 'maybe')

var2 <- c(1:18)

df <- data.frame(var1, var2)

This works fine (i.e., it filters out the NAs):

df %>% filter(!is.na(var1))

...but this doesn't:

x <- "var1"

df %>% filter(!is.na(x))

...but this does:

df %>% select(x)

It's specifically the NAs that need to be filtered out.

I tried get("x"), which didn't help, and also base R subsetting:

df[!is.na(x),]

...which didn't work either.

Any ideas on how to pass a variable to filter() inside (or outside) a function, and why a variable works with other dplyr functions?


asked Jul 23 '17 03:07

Conner M.

1 Answer

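filter() evaluates its arguments in the context of the data frame, so x is looked up as an ordinary variable and yields the string "var1". is.na("var1") is FALSE, so !is.na(x) is TRUE and every row is kept. select() behaves differently because it accepts a character vector of column names. We can use sym() to convert the string to a symbol and then unquote it with UQ(), so that filter() sees the column itself: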

library(rlang)
library(dplyr)
df %>%
   filter(!is.na(UQ(sym(x))))
#     var1 var2
#1    yes    1
#2    yes    4
#3    yes    5
#4    yes    9
#5     no   11
#6     no   12
#7     no   13
#8  maybe   14
#9  maybe   16
#10 maybe   17
#11 maybe   18
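In more recent rlang releases UQ() has been deprecated in favor of the !! operator, and dplyr also provides the .data pronoun for looking up a column by a string name. Below is a minimal sketch of both, wrapped in a function as the question describes. The helper name drop_na_rows() and its arguments are hypothetical, chosen only for illustration, and the last line shows a base R equivalent of the subsetting the question attempted.

```r
library(dplyr)

df <- data.frame(
  var1 = c("yes", NA, NA, "yes", "yes", NA, NA, NA, "yes", NA,
           "no", "no", "no", "maybe", NA, "maybe", "maybe", "maybe"),
  var2 = 1:18
)
x <- "var1"

# Why the original attempt keeps every row: df has no column called `x`,
# so `x` resolves to the string "var1", which is never NA.
is.na(x)  # FALSE

# Hypothetical helper (name and arguments are assumptions for illustration):
# `col` is a string, and .data[[col]] looks that column up in the data frame.
drop_na_rows <- function(data, col) {
  data %>% filter(!is.na(.data[[col]]))
}
res_pronoun <- drop_na_rows(df, x)

# Same idea as the answer above, with !! in place of UQ().
res_sym <- df %>% filter(!is.na(!!rlang::sym(x)))

# Base R: index the column by name instead of testing the string itself.
res_base <- df[!is.na(df[[x]]), ]
```

All three results should contain the same 11 rows as the output above. The .data[[col]] form is often the simplest choice when the column name arrives as a string, while !!sym() is handy when a symbol is needed elsewhere in the expression.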

answered Nov 14 '22 21:11

akrun
