I have the following vector in R and I would like to find all the strings that contain an A or a B but do not contain the digit 2.
vec1<-c("A_cont_1", "A_cont_12", "B_treat_8", "AB_cont_22", "cont_21_Aa")
The following does not work:
grep("A|B|!2", vec1)
It gives me back all the strings:
[1] 1 2 3 4 5
The same is true for this example:
grep("A|B|-2", vec1)
What would be the correct syntax?
The grep() function in R searches for matches of a pattern within each element of a character vector. Its pattern argument is the regular expression to match against those elements.
grepl() returns a logical vector indicating which elements of a character vector contain a match. For example, to find which states in the United States have names beginning with the word "New", grepl("^New", state.name) returns a logical vector that can be used to subset the built-in state.name vector.
grep() returns the indices of the matching elements, or the matching elements themselves with value = TRUE, while grepl() returns a logical vector with TRUE for a match and FALSE otherwise. Both can be used to filter data or to locate a pattern before changing or replacing it.
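To make the difference concrete, here is a minimal sketch using the vector from the question. The variable names idx, vals and has2 are only illustrative. It also hints at why the attempts above return everything: "!" and "-" have no negating meaning inside a regular expression, so "A|B|!2" simply means "contains A, or B, or the literal text !2".

```r
vec1 <- c("A_cont_1", "A_cont_12", "B_treat_8", "AB_cont_22", "cont_21_Aa")

# grep() returns the positions of the elements that contain a "2"
idx <- grep("2", vec1)
idx    # 2 4 5

# value = TRUE returns the matching elements instead of their positions
vals <- grep("2", vec1, value = TRUE)
vals   # "A_cont_12" "AB_cont_22" "cont_21_Aa"

# grepl() returns one TRUE/FALSE per element
has2 <- grepl("2", vec1)
has2   # FALSE TRUE FALSE TRUE TRUE

# "!" is an ordinary character in a pattern, so this is just an
# alternation and every element matches through the "A" or "B" branch
grep("A|B|!2", vec1)
```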
You can do this with a fairly simple regular expression:
grep("^[^2]*[AB][^2]*$", vec1)
In words, it means:
^ : match the start of the string
[^2]* : match anything except "2", zero or more times
[AB] : match "A" or "B"
[^2]* : match anything except "2", zero or more times
$ : match the end of the string

I would use two grep calls:
intersect(grep("A|B", vec1), grep("2", vec1, invert = TRUE))
#[1] 1 3
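The same "contains A or B, and does not contain 2" logic can also be sketched with grepl(), which makes the negation explicit, or with a single pattern that uses a negative lookahead. Neither variant comes from the answers above; the names keep and idx_lookahead are illustrative, and the lookahead form assumes perl = TRUE, since the default regex engine does not support lookaheads.

```r
vec1 <- c("A_cont_1", "A_cont_12", "B_treat_8", "AB_cont_22", "cont_21_Aa")

# Combine two logical vectors: has an A or B, and has no 2
keep <- grepl("[AB]", vec1) & !grepl("2", vec1)
which(keep)   # 1 3
vec1[keep]    # "A_cont_1" "B_treat_8"

# Single pattern: (?!.*2) fails at the start if a "2" occurs anywhere,
# then .*[AB] requires an "A" or "B" somewhere in the string
idx_lookahead <- grep("^(?!.*2).*[AB]", vec1, perl = TRUE)
idx_lookahead # 1 3
```

The grepl() form tends to be easier to extend when more conditions are added, because each condition stays a separate, readable pattern.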