I have a vector of patterns that I want to grep against a second vector, but I want the results to follow the order of the patterns. I solved it using a loop, although I'm wondering if there is any other (better) way of doing it.
E.g.:
to_match <- c("KZB8","KBB9","KBC9","KZA9","KZB2","KZB5","KZB6")
vectorA <- c("RuL_KZA9","RuL_KZB9","RuL_KZA5","RuL_KZC6","RuL_KZB8")
I solved it like this:
matching <- c()
for (i in to_match) {
  # elements of vectorA that contain the current pattern
  hits <- grep(i, vectorA, value = TRUE)
  matching <- c(matching, hits)
}
> matching
[1] "RuL_KZB8" "RuL_KZA9"
BTW, I saw the great answers here: grep using a character vector with multiple patterns
But as you can see, the problem with:
grep(paste(to_match, collapse = "|"), vectorA, value = TRUE)
[1] "RuL_KZA9" "RuL_KZB8"
is that the matches are returned in the order in which they appear in vectorA, not in the order of to_match.
Thanks in advance for your ideas for more efficient code.
Niko
We can also use grep and grepl to check for multiple character patterns in a vector of character strings. We simply insert the | operator between the patterns we want to search for. For example, the pattern "a|c" makes both functions search for either "a" or "c".
grepl() stands for "grep logical". It is a built-in R function that takes a pattern and a character vector and returns TRUE for each element that contains the pattern, and FALSE otherwise.
Both functions allow you to see whether a certain pattern exists in a character string, but they return different results: grepl() returns TRUE when a pattern exists in a character string. grep() returns a vector of indices of the character strings that contain the pattern.
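A minimal sketch of that difference, reusing the two vectors from the question (the names pattern, idx and hit are only illustrative and not part of the original post):

```r
to_match <- c("KZB8","KBB9","KBC9","KZA9","KZB2","KZB5","KZB6")
vectorA <- c("RuL_KZA9","RuL_KZB9","RuL_KZA5","RuL_KZC6","RuL_KZB8")

# Combine all patterns into one alternation: "KZB8|KBB9|..."
pattern <- paste(to_match, collapse = "|")

# grep() returns the positions in vectorA that match, in ascending order
idx <- grep(pattern, vectorA)
idx
## [1] 1 5

# grepl() returns one TRUE/FALSE per element of vectorA
hit <- grepl(pattern, vectorA)
hit
## [1]  TRUE FALSE FALSE FALSE  TRUE
```

Because the indices come back in ascending order, any result built from a single combined pattern follows the order of vectorA rather than the order of to_match.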
On the command line, the basic grep syntax for searching a file for multiple patterns is the grep command followed by the patterns and the name of the file or its path. Enclose the patterns in single quotes and separate them with the pipe symbol. With GNU grep's default basic regular expressions the pipe must be escaped as \|, whereas with grep -E an unescaped | works.
Try lapply:
unlist(lapply(to_match, grep, vectorA, value = TRUE))
## [1] "RuL_KZB8" "RuL_KZA9"
or
unlist(sapply(to_match, grep, vectorA, value = TRUE))
## KZB8 KZA9
## "RuL_KZB8" "RuL_KZA9"
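A possible variation on the lapply approach, sketched under two assumptions that the question does not state: that the patterns are literal strings rather than regular expressions (hence fixed = TRUE), and that an element matching several patterns should be reported only once (hence unique()). The name hits is only illustrative.

```r
to_match <- c("KZB8","KBB9","KBC9","KZA9","KZB2","KZB5","KZB6")
vectorA <- c("RuL_KZA9","RuL_KZB9","RuL_KZA5","RuL_KZC6","RuL_KZB8")

# One grep() call per pattern, so results come back in the order of to_match.
# fixed = TRUE (an assumption) treats each pattern as a literal string.
hits <- lapply(to_match, grep, x = vectorA, value = TRUE, fixed = TRUE)

# Flatten the list and drop repeats, keeping the first occurrence of each element
matching <- unique(unlist(hits))
matching
## [1] "RuL_KZB8" "RuL_KZA9"
```

Patterns with no match contribute an empty character vector, which unlist() simply drops, so no extra filtering is needed.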