
Match only exact column names with the dplyr matches() helper function

Tags: r, dplyr

I am using the matches() helper function as part of an argument to select() in the dplyr package.

The function looks like this for a hypothetical df data frame:

select(df, variable_1_name, matches("variable_2_name"))

At least as I'm currently using it, variable_2_name must be passed as a string to select().

However, if there is another variable in df that matches "variable_2_name", such as "variable_2_name_recode", then matches() will match both of those variables. Is it possible to match only exact matches with a dplyr function, or with a different approach?

Joshua Rosenberg asked Aug 18 '16 01:08




1 Answer

You can of course just do the following when a string is not required:

select(df, variable_1_name, variable_2_name)

matches() takes a regular expression pattern, so you can anchor it at both ends:

# '^' anchors the match at the beginning of the string and
# '$' anchors the match at the end of the string.
select(df, variable_1_name, matches("^variable_2_name$"))

This should match only the column named variable_2_name, and not variable_2_name_recode.
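As a minimal sketch of the difference, the example below builds a hypothetical df (the data values are made up for illustration; only the column names come from the question) and compares the unanchored and anchored patterns:

```r
library(dplyr)

# Hypothetical data frame with two similarly named columns
df <- data.frame(
  variable_1_name = 1:3,
  variable_2_name = c("a", "b", "c"),
  variable_2_name_recode = c(1, 2, 3)
)

# Unanchored pattern: selects every column whose name contains the pattern
loose <- select(df, variable_1_name, matches("variable_2_name"))
names(loose)

# Anchored pattern: selects only the column whose whole name is the pattern
exact <- select(df, variable_1_name, matches("^variable_2_name$"))
names(exact)
```

Here loose keeps all three columns, while exact keeps only variable_1_name and variable_2_name. Note that matches() has ignore.case = TRUE by default, so the anchored pattern would also select a column named Variable_2_Name; pass ignore.case = FALSE if that matters.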

If you have a function that selects a column from a name passed as a string, you could do the following (as mentioned by Psidom in a comment). The first example is simpler; the second is closer to what you are looking for.

### Example 1
### Given function and the 'df' with the column 'variable_2_name'
my_func <- function(df, colname) { df %>% select_(colname) }
my_func(df, 'variable_2_name') # Call with column name string

### Example 2
### Combining one unquoted column name with one column name passed as a string.
### 'df' has columns 'variable_1_name' and 'variable_2_name'
my_func <- function(df, colname) {
    df %>% select_(quote(variable_1_name), colname)
}
### Call with column name returns 2 columns of data
### 'variable_1_name' and 'variable_2_name'
my_func(df, 'variable_2_name')

Edit

dplyr::select_ is now deprecated. The code above can be adapted to use dplyr::select instead; in recent dplyr versions, select() accepts a column name held in a string when it is wrapped in all_of().
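One possible rewrite of Example 2 without select_ is sketched below. It assumes dplyr 1.0.0 or later, where the all_of() helper is available, and it uses a hypothetical df with made-up values. all_of() selects the columns named exactly by a character vector and raises an error if any of them is missing.

```r
library(dplyr)

# Hypothetical data frame; only the column names come from the question
df <- data.frame(
  variable_1_name = 1:3,
  variable_2_name = c("a", "b", "c"),
  variable_2_name_recode = c(1, 2, 3)
)

# 'colname' is a string; all_of() turns it into an exact-name selection
my_func <- function(df, colname) {
  df %>% select(variable_1_name, all_of(colname))
}

# Returns two columns: 'variable_1_name' and 'variable_2_name'
result <- my_func(df, "variable_2_name")
names(result)
```

Because all_of() compares names literally rather than as a pattern, no regular expression anchors are needed and variable_2_name_recode is not selected.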

steveb answered Sep 27 '22 19:09
