
Use both empty and string filters in dplyr's filter

Tags: r, dplyr, nse

I'm updating an old script that uses the deprecated dplyr::filter_() so that it uses dplyr::filter() instead, but I can't get it to work for empty filters anymore.

Example:

library(dplyr)
my_df <- tibble::tibble(x = sample(c(0:9), 100, replace = TRUE))

The deprecated filter_() works for both filter strings and empty (NULL) filters:

fil1 <- "x == 5"
filter_(my_df, .dots = fil1) # works

fil2 <- NULL
filter_(my_df, .dots = fil2) # works, returns all rows

The NSE version works with quoted filter expressions, but not with empty ones:

fil1 = quo(x == 5)
filter(my_df, !!enquo(fil1)) # works

fil2 = NULL
filter(my_df, !!enquo(fil2)) 
Error: Argument 2 filter condition does not evaluate to a logical vector

fil2 = quo(NULL)
filter(my_df, !!enquo(fil2))
Error: Argument 2 filter condition does not evaluate to a logical vector

I see three possible approaches to this:

  • quote NULL differently
  • use another expression instead of NULL
  • use another argument inside filter()
Timm S. asked May 15 '20 14:05


2 Answers

If you specify your filters as lists of expressions (or NULL), you can use the unquote-splice operator (!!!) to effectively "paste" them as arguments to filter(). Use rlang::parse_exprs() to convert strings to expressions:

fil1 <- rlang::exprs(x == 5)          # Note the s on exprs
filter(my_df, !!!fil1)                # Works

fil2 <- NULL                          # NULL
filter(my_df, !!!fil2)                # Also works

fil3 <- rlang::parse_exprs("x==5")    # Again, note the plural s
filter(my_df, !!!fil3)                # Also works

The first and the third calls are effectively filter(my_df, x == 5), while the second call is effectively filter(my_df), which applies no condition and returns every row.
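The splice pattern above can be wrapped in a small helper that accepts either a filter string or NULL. This is only a minimal sketch: `filter_by_string` is a hypothetical name, not a dplyr or rlang function, and it assumes `fil` is NULL or a single string.

```r
library(dplyr)

set.seed(2020)
my_df <- tibble::tibble(x = sample(0:9, 100, replace = TRUE))

# Hypothetical helper (not part of dplyr or rlang).
# Assumption: `fil` is either NULL or a single string of R code.
filter_by_string <- function(df, fil = NULL) {
  # No filter -> empty list, so `!!!` splices nothing into filter()
  conds <- if (is.null(fil)) list() else rlang::parse_exprs(fil)
  filter(df, !!!conds)
}

filter_by_string(my_df, "x == 5")  # same rows as filter(my_df, x == 5)
filter_by_string(my_df, NULL)      # no condition, so all rows are kept
```

Checking for NULL explicitly keeps the string-parsing step out of the empty case, so the function never has to rely on how the parser treats a missing input.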

Artem Sokolov answered Oct 16 '22 23:10


If I understand you correctly, @timm-s, your second option (use another expression instead of NULL) leads to this solution:

set.seed(2020)
library(dplyr)

my_df <- tibble::tibble(x = sample(c(0:9), 100, replace = TRUE))

fil1 <- quo(x == 5)
filter(my_df, !!enquo(fil1)) # works
#> # A tibble: 11 x 1
#>        x
#>    <int>
#>  1     5
#>  2     5
#>  3     5
#>  4     5
#>  5     5
#>  6     5
#>  7     5
#>  8     5
#>  9     5
#> 10     5
#> 11     5

fil2 <- TRUE
filter(my_df, !!enquo(fil2)) 
#> # A tibble: 100 x 1
#>        x
#>    <int>
#>  1     6
#>  2     5
#>  3     7
#>  4     0
#>  5     0
#>  6     3
#>  7     9
#>  8     5
#>  9     0
#> 10     7
#> # … with 90 more rows

This simply relies on the fact that filter() keeps the rows where its condition is TRUE, so instead of telling it nothing, tell it TRUE. For me the real question was why filter_() thought NULL was true, LOL.

A little more playing revealed that it's possible to simplify further for the empty case:

fil3 <- TRUE
filter(my_df, fil3) 

will also work but may not fit your circumstances.
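If the filter may or may not be supplied, the TRUE fallback can be applied in one place. The following is a rough sketch under assumptions: `filter_or_all` is a hypothetical helper name, and `fil` is assumed to be either NULL or a quosure such as quo(x == 5).

```r
library(dplyr)

set.seed(2020)
my_df <- tibble::tibble(x = sample(0:9, 100, replace = TRUE))

# Hypothetical helper (not part of dplyr).
# Assumption: `fil` is NULL or a quosure created with quo().
filter_or_all <- function(df, fil = NULL) {
  # Replace a missing filter with TRUE, which keeps every row
  cond <- if (is.null(fil)) TRUE else fil
  filter(df, !!cond)
}

filter_or_all(my_df, quo(x == 5))  # only rows where x is 5
filter_or_all(my_df)               # fil is NULL, so all rows are kept
```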

Chuck P answered Oct 17 '22 01:10