I have some data which I want to clean up using a regular expression in R.
It is easy to find out how to get elements that contain a certain pattern, or that do not contain certain words (strings), but I can't find out how to exclude elements that contain a pattern.
How could I use a general function to only keep those elements from a vector which do not contain PATTERN?
I prefer not to give an example, as this might lead people to answer using other (though usually nice) ways than the intended one: excluding based on a regular expression. Here goes anyway:
How to exclude all the elements that contain any of the following characters:
'pyfgcrl
vector <- c("Cecilia", "Cecily", "Cecily's", "Cedric", "Cedric's", "Celebes",
"Celebes's", "Celeste", "Celeste's", "Celia", "Celia's", "Celina")
The result would be an empty vector in this case.
Edit: From the comments, and with a little testing, one would find that my suggestion wasn't correct.
Here are two correct solutions:
vector[!grepl("['pyfgcrl]", vector)] ## kohske
grep("['pyfgcrl]", vector, value = TRUE, invert = TRUE) ## flodel
If either of them wants to re-post and accept credit for their answer, I'm more than happy to delete mine here.
The general function that you are looking for is grepl. From the help file for grepl:
grepl returns a logical vector (match or not for each element of x).
Additionally, you should read the help page for regex, which describes what character classes are. In this case, you create a character class ['pyfgcrl], which says to look for any one of the characters inside the square brackets. You can then negate the logical vector that grepl returns with !.
So, up to this point, we have something that looks like:
!grepl("['pyfgcrl]", vector)
To get what you are looking for, you subset as usual.
vector[!grepl("['pyfgcrl]", vector)]
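Since the vector from the question ends up empty, here is a minimal sketch of the same steps on a made-up vector where some elements survive. The names in names_in are hypothetical and chosen only for illustration; the pattern is the one from the question.

```r
# Hypothetical input: these names are made up for illustration.
names_in <- c("Cecilia", "Emma", "Celia's", "Otto", "Cedric", "Anna")

# TRUE where an element contains at least one of the characters ' p y f g c r l
has_match <- grepl("['pyfgcrl]", names_in)

# Negate the logical vector and use it to subset: keep the non-matching elements
kept <- names_in[!has_match]
print(kept)
```

Note that the character class is case sensitive, so an uppercase "C" on its own does not count as a match for the lowercase "c" in the class.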
For the second solution, offered by @flodel: grep by default returns the indices of the elements that match, and the value = TRUE argument makes it return the matching strings themselves instead. invert = TRUE means to return the elements that did not match.
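To make the effect of each grep argument concrete, here is a small sketch on a made-up three-element vector (x is hypothetical, not from the question). It also checks that the grep and grepl approaches agree on this input.

```r
# Hypothetical input for illustration only.
x <- c("Cecilia", "Emma", "Otto")
pattern <- "['pyfgcrl]"

# Default: integer indices of the elements that match
idx <- grep(pattern, x)

# value = TRUE: the matching strings themselves
matched <- grep(pattern, x, value = TRUE)

# value = TRUE plus invert = TRUE: the strings that do NOT match
unmatched <- grep(pattern, x, value = TRUE, invert = TRUE)

# The grepl-based subset gives the same result on this input
same <- identical(unmatched, x[!grepl(pattern, x)])

print(idx)
print(matched)
print(unmatched)
print(same)
```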