
Filter data frame by character column name (in dplyr)

Tags: r, dplyr

I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?

    library(dplyr)
    df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
    df
    #   this that
    # 1    1    1
    # 2    2    1
    # 3    2    2
    df %>% filter(this == 1)
    #   this that
    # 1    1    1

But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol and get work in other contexts, but not this:

    column <- "this"
    df %>% filter(as.symbol(column) == 1)
    # [1] this that
    # <0 rows> (or 0-length row.names)
    df %>% filter(get(column) == 1)
    # Error in get("this") : object 'this' not found

How can I turn the value of column into a column name?

Asked by William Denton on Nov 29 '14 at 00:11



1 Answer

Using rlang's injection paradigm

From the current dplyr documentation (emphasis mine):

dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous.

So, essentially we need to perform two steps to be able to refer to the value "this" of the variable column inside dplyr::filter():

  1. We need to turn the variable column, which is of type character, into a symbol.

    Using base R, this can be achieved with as.symbol(), which is an alias for as.name(). The former is preferred by the tidyverse developers because it follows a more modern terminology (R types instead of S modes).

    Alternatively, the same can be achieved by rlang::sym() from the tidyverse.

  2. We need to inject the symbol from 1) into the dplyr::filter() expression.

    This is done with the so-called injection operator !!, which is basically syntactic sugar that lets you modify a piece of code before R evaluates it.

    (In earlier versions of dplyr, or rather of the underlying rlang, there were situations, including yours, where !! would collide with the single !. This is no longer an issue since !! gained the right operator precedence.)
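To see what the injection actually produces before filter() evaluates it, one option is to build the same expression with rlang::expr(), which supports !! in the same way. This is only an inspection sketch, assuming a reasonably recent rlang is installed (it comes with dplyr); the variable name injected is made up for illustration.

```r
library(rlang)

column <- "this"

# Step 1: character -> symbol
col_sym <- as.symbol(column)

# Step 2: inject the symbol into an expression.
# expr() captures the code and replaces `!!col_sym` with the symbol itself,
# so the result is the same call you would have typed by hand.
injected <- expr(!!col_sym == 1)

print(injected)
```

The printed expression should read as the plain comparison on the column, which is what filter() ends up evaluating against the data frame.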

Applied to your example:

    library(dplyr)
    df <- data.frame(this = c(1, 2, 2),
                     that = c(1, 1, 2))
    column <- "this"

    df %>% filter(!!as.symbol(column) == 1)
    #   this that
    # 1    1    1
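If the column name arrives as a function argument, the same two steps can be wrapped in a small helper. The function filter_by_column() below is a hypothetical name invented for this sketch, not something dplyr provides, and it uses rlang::sym() in place of as.symbol().

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep the rows where the column
# named by the string `column` equals `value`.
filter_by_column <- function(data, column, value) {
  col_sym <- rlang::sym(column)        # step 1: character -> symbol
  data %>% filter(!!col_sym == value)  # step 2: inject the symbol
}

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))

by_this <- filter_by_column(df, "this", 1)
by_that <- filter_by_column(df, "that", 1)
```

One caveat with this sketch: if the data frame itself had a column called value, that column would take precedence over the function argument inside filter(). Writing !!value or .env$value instead would avoid the ambiguity.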

Using alternative solutions

Other ways to refer to the value "this" of the variable column inside dplyr::filter() that don't rely on rlang's injection paradigm include:

  • Via the tidyselection paradigm, i.e. dplyr::if_any()/dplyr::if_all() with tidyselect::all_of()

    df %>% filter(if_any(.cols = all_of(column),
                         .fns = ~ .x == 1))
  • Via rlang's .data pronoun and base R's [[:

    df %>% filter(.data[[column]] == 1) 
  • Via magrittr's . argument placeholder and base R's [[:

    df %>% filter(.[[column]] == 1) 
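As a quick sanity check, the alternatives above can be run side by side with the injection approach on the example data. This is a sketch that assumes a dplyr version that provides if_any() (1.0.4 or later); the res_* names are made up for illustration.

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

# Injection of a symbol
res_inject <- df %>% filter(!!as.symbol(column) == 1)

# Tidyselection with if_any() and all_of()
res_if_any <- df %>% filter(if_any(.cols = all_of(column),
                                   .fns = ~ .x == 1))

# The .data pronoun
res_data <- df %>% filter(.data[[column]] == 1)

# The magrittr placeholder (only works with %>%, not the native |> pipe)
res_dot <- df %>% filter(.[[column]] == 1)
```

Of these, the .data pronoun is probably the easiest to read inside functions and packages, while the . placeholder ties the code to the magrittr pipe.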
Answered by Salim B on Oct 05 '22 at 14:10