I am trying to use grep to test whether any of the strings in one vector are present in another vector, and to output the values that are present (the matching patterns).
I have a data frame like this:
FirstName Letter
Alex      A1
Alex      A6
Alex      A7
Bob       A1
Chris     A9
Chris     A6
I have a vector of string patterns to be found in the "Letter" column, for example: c("A1", "A9", "A6").
I would like to check whether any of the strings in the pattern vector are present in the "Letter" column. If they are, I would like the output to be the unique matching values.
The problem is, I don't know how to use grep with multiple patterns. I tried:
matches <- unique ( grep("A1| A9 | A6", myfile$Letter, value=TRUE, fixed=TRUE) )
But it gives me 0 matches, which is not correct. Any suggestions?
Example 2: Apply grep & grepl with Multiple Patterns
We can also use grep and grepl to check for multiple character patterns in our vector of character strings. We simply need to insert an | operator between the patterns we want to search for.
Both functions allow you to see whether a certain pattern exists in a character string, but they return different results: grepl() returns TRUE when a pattern exists in a character string. grep() returns a vector of indices of the character strings that contain the pattern.
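As a small illustration of that difference, here is a sketch using a made-up vector x (the values are hypothetical and not taken from the question's data):

```r
# Hypothetical example vector, not from the original data frame.
x <- c("A1", "B2", "A9", "C3")

# grep() returns the positions of the elements that match the pattern.
idx <- grep("A1|A9", x)
print(idx)

# grepl() returns one logical value per element of x.
hit <- grepl("A1|A9", x)
print(hit)

# With value = TRUE, grep() returns the matching strings instead of indices.
vals <- grep("A1|A9", x, value = TRUE)
print(vals)
```

Which form is more convenient depends on the next step: indices and logical vectors are handy for subsetting, while value = TRUE gives the matched strings directly.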
For the command-line grep utility (as opposed to R's grep() function), the basic syntax for searching a file for multiple patterns is the grep command followed by the patterns and the file name or path. Enclose the patterns in single quotes and separate them with the pipe symbol. In basic regular expressions, put a backslash before the pipe (\|).
grepl() stands for "grep logical". It is a built-in R function that searches for matches of a pattern in a string or a vector of strings. grepl() takes a pattern and the data to search, and returns TRUE for each string that contains the pattern and FALSE otherwise.
In addition to @Marek's comment about not including fixed=TRUE (with fixed=TRUE the pattern is treated as a literal string, so | no longer means "or"), you also need to remove the spaces from your regular expression. It should be "A1|A9|A6".
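To see both problems in isolation, here is a minimal sketch. The vector letters_col is a stand-in for myfile$Letter, rebuilt from the values shown in the question:

```r
# Stand-in for myfile$Letter, using the values from the question.
letters_col <- c("A1", "A6", "A7", "A1", "A9", "A6")

# fixed = TRUE: the whole pattern, pipes included, is searched for literally,
# so nothing matches.
fixed_hits <- grep("A1|A9|A6", letters_col, value = TRUE, fixed = TRUE)
print(length(fixed_hits))

# Regex with stray spaces: the alternatives are "A1", " A9 " and " A6",
# so only values containing "A1" can match here.
spaced_hits <- unique(grep("A1| A9 | A6", letters_col, value = TRUE))
print(spaced_hits)

# Regex without spaces and without fixed = TRUE: all three patterns match.
clean_hits <- unique(grep("A1|A9|A6", letters_col, value = TRUE))
print(clean_hits)
```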
You also mention that there are lots of patterns. Assuming that they are in a vector
toMatch <- c("A1", "A9", "A6")
Then you can create your regular expression directly using paste and collapse = "|".
matches <- unique (grep(paste(toMatch,collapse="|"), myfile$Letter, value=TRUE))
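Put together as a self-contained sketch, with myfile rebuilt from the table in the question (the real data frame may differ). The last part is an addition not in the original answer: because grep does substring matching, "A1" would also match a value such as "A10", so an anchored pattern is shown as one possible way to require whole-value matches. The value "A10" is hypothetical and added only to show the effect.

```r
# Data frame reconstructed from the question; assumed to mirror myfile.
myfile <- data.frame(
  FirstName = c("Alex", "Alex", "Alex", "Bob", "Chris", "Chris"),
  Letter    = c("A1", "A6", "A7", "A1", "A9", "A6"),
  stringsAsFactors = FALSE
)

toMatch <- c("A1", "A9", "A6")

# Join the patterns into one regular expression: "A1|A9|A6".
pattern <- paste(toMatch, collapse = "|")

# Unique values in the Letter column that match any of the patterns.
matches <- unique(grep(pattern, myfile$Letter, value = TRUE))
print(matches)

# Optional: anchor the pattern so only whole values match.
# "A10" is a hypothetical extra value, not part of the question's data.
anchored <- paste0("^(", pattern, ")$")
exact <- unique(grep(anchored, c(myfile$Letter, "A10"), value = TRUE))
print(exact)
```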