
Non-standard evaluation (NSE) in dplyr's filter_ & pulling data from MySQL

I'd like to pull some data from a MySQL server with a dynamic filter. I'm using the great R package dplyr in the following way:

    # Create the filter
    filter_criteria <- ~ column1 %in% some_vector

    # Connect to the database
    connection <- src_mysql(dbname = "mydbname",
                            user = "myusername",
                            password = "mypwd",
                            host = "myhost")

    # Get data
    data <- connection %>%
      tbl("mytable") %>%                    # Specify which table
      filter_(.dots = filter_criteria) %>%  # Non-standard evaluation filter
      collect()                             # Pull data

This piece of code works fine, but now I'd like to loop it over all the columns of my table, so I'd like to write the filter as:

    # Dynamic filter
    i <- 2  # With a loop on this i, for instance
    which_column <- paste0("column", i)
    filter_criteria <- ~ which_column %in% some_vector

And then reapply the first code with the updated filter.

Unfortunately this approach doesn't give the expected results. It raises no error, but it doesn't pull any rows into R either. Looking at the SQL queries generated by the two pieces of code, there is one important difference.

While the first, working, code generates a query of the form:

SELECT ... FROM ... WHERE  `column1` IN .... 

(` sign in the column name), the second one generates a query of the form:

SELECT ... FROM ... WHERE  'column1' IN .... 

(' sign in the column name)

Does anyone have any suggestion on how to formulate the filtering condition to make it work?

Lorenzo Rossi asked Oct 21 '14 17:10


1 Answer

It's not really related to SQL. This example in R does not work either:

    df <- data.frame(
      v1 = sample(5, 10, replace = TRUE),
      v2 = sample(5, 10, replace = TRUE)
    )
    df %>% filter_(~ "v1" == 1)

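It does not work because filter_ needs the expression ~ v1 == 1, not the expression ~ "v1" == 1. The second form compares the literal string "v1" with 1, which is always FALSE, so no rows come back. That is the same thing you see in your SQL: 'column1' is a string literal, while `column1` is a column name.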

To solve the problem, use the quoting function quo() and the unquoting operator !! (both available since dplyr 0.7):

    library(dplyr)

    which_column <- quo(v1)  # quote the column name
    df %>% filter(!!which_column == 1)  # unquote it inside filter()
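For the loop in the question, the column name starts out as a string, so it has to be turned into a symbol before it is unquoted. Here is a minimal sketch of one way to do that, using rlang::sym(). The data frame, the column names and some_vector below are made-up stand-ins for the MySQL table. I have only run this pattern against an in-memory data frame; against a database tbl the symbol should be translated to a backtick-quoted identifier, but treat that as an assumption to check with show_query().

```r
library(dplyr)

# Hypothetical in-memory stand-in for the MySQL table
df <- data.frame(
  column1 = c(1, 2, 3, 4),
  column2 = c(3, 3, 5, 1)
)
some_vector <- c(1, 3)

results <- list()
for (i in 1:2) {
  which_column <- paste0("column", i)
  # sym() converts the string to a symbol; !! unquotes it inside filter().
  # The parentheses make the scope of !! explicit.
  results[[which_column]] <- df %>%
    filter((!!rlang::sym(which_column)) %in% some_vector)
}
```

Each element of results holds the rows where the corresponding column matched some_vector. With a database source you would replace df with connection %>% tbl("mytable") and add collect() at the end of the pipe.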
Matthew answered Sep 20 '22 22:09