
How do I invert the helper functions for dplyr::select?

Tags: r, dplyr

How do I invert the helper functions for dplyr::select() (like matches() or contains()) so that I can select variables that do NOT contain or match a particular string?

For example, say I wanted to select all the columns in the mtcars data frame that did not have the letter "m" in them. I could imagine doing something like:

mtcars %>%
    select( !matches("m") )

But that throws the error:

Error: !matches("m") must resolve to integer column positions, not a logical vector

How do I write the helper function to invert it?

Important note: one possibility is to use matches() and write a regular expression that doesn't match, but I'm more interested in finding a way to maintain the simplicity of the helper functions but invert the selection they return, rather than solving the actual "how do I select such-and-such" problem.

crazybilly asked Aug 29 '17 14:08



1 Answer

The helper functions for select(), such as matches(), contains(), and starts_with(), return a vector of integer column positions. In the example above, if we didn't want the inverse, matches("m") would return c(1, 9) because the first and ninth column names (mpg and am) contain "m".
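To see where c(1, 9) comes from, the positions can be reproduced in base R. This is only a sketch of the idea: grep() is an assumed stand-in for what matches() does, not a description of how dplyr implements the helper.

```r
# Base R stand-in for matches("m"): positions of column names containing "m".
# grep() is used for illustration only; dplyr's helper may work differently inside.
cols <- names(mtcars)
m_positions <- grep("m", cols)

print(m_positions)        # 1 9
print(cols[m_positions])  # "mpg" "am"
```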

With that in mind, all we have to do is negate the helper's result with a minus sign:

mtcars %>%
    select( -matches("m") )

That turns the result of matches("m") into c(-1, -9), which drops those columns and keeps everything else.
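A quick way to check the result is to look at the remaining column names. The sketch below assumes dplyr is installed; the variable names no_m and no_m_base are made up for illustration, and the base R line is only an analogy for how negative positions drop columns.

```r
library(dplyr)

# Drop every column whose name matches "m"; all other columns are kept.
no_m <- mtcars %>%
    select(-matches("m"))

print(names(no_m))

# Base R analogy: negative positions drop the first and ninth columns.
no_m_base <- mtcars[, -c(1, 9)]
print(identical(names(no_m), names(no_m_base)))
```

Neither mpg nor am should appear in the printed names.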

Using !, the boolean NOT, as shown in the original example, coerces the integer values to logical, so instead of c(1, 9) you end up with c(FALSE, FALSE): both 1 and 9 coerce to TRUE and are then inverted by the !.

This explains the error R was throwing above: select() wants a vector of integers corresponding to column positions, not a vector of logical values.
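The difference between the two operators can be seen in plain R, without dplyr. This is a minimal sketch that assumes the helper's result is the integer vector c(1, 9) described above; the variable names are illustrative.

```r
# Assumed result of matches("m") on mtcars, as described in the answer.
positions <- c(1L, 9L)

negated  <- -positions   # stays an integer vector
inverted <- !positions   # non-zero integers count as TRUE, then get flipped

print(negated)           # -1 -9
print(inverted)          # FALSE FALSE
print(class(inverted))   # "logical"
```

The minus sign keeps the values usable as column positions, while ! discards the positions entirely and leaves only logical values.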

crazybilly answered Oct 19 '22 08:10
