I have a vector of strings:
s <- c('abc1', 'abc2', 'abc3', 'abc11', 'abc12',
'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
'nonsense')
I would like a regular expression to match only the strings that begin with abc and end with 3, 11, or 12. In other words, the regex has to exclude abc1 but not abc11, abc2 but not abc12, and so on.
I thought that this would be easy to do with lookahead assertions, but I haven't found a way. Is there one?
EDIT: Thanks to posters below for pointing out a serious ambiguity in the original post.
In reality, I have many strings. They all end in digits: some in 0, some in 9, some in the digits in between. I am looking for a regex that will match all strings except those that end with a letter followed by a 1 or a 2. (The regex should also match only those strings that start with abc, but that's an easy problem.)
I tried to use negative lookahead assertions to create such a regex. But I didn't have any success.
Thanks to all who replied and commented. Inspired by several of you, I ended up using this combination: grepl('^abc', s) & !grepl('[[:lower:]][12]$', s).
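As a quick check, the combination can be applied to the sample vector like this. The variable name keep is only for illustration; everything else comes from the post above.

```r
s <- c('abc1', 'abc2', 'abc3', 'abc11', 'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# TRUE where the string starts with "abc" AND does not end in
# a lowercase letter followed by 1 or 2
keep <- grepl('^abc', s) & !grepl('[[:lower:]][12]$', s)

s[keep]
```

On this vector the result should be abc3, abc11, abc12, abcde3, abcde11 and abcde12.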
Instead of one complicated regular expression, in this case I think it's easier to use two simple regular expressions:
s <- c('abc1', 'abc2', 'abc3', 'abc11', 'abc12',
'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
'nonsense')
s[grepl("^abc", s) & grepl("(3|11|12)$", s)]
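If a single pattern is still wanted, one possible sketch uses a negative lookbehind rather than a lookahead. This follows the clarified requirement from the edit (start with abc, do not end in a letter followed by 1 or 2) and assumes the letters are lowercase. Lookbehind is not available in R's default regex engine, so perl = TRUE is needed. The names pattern and matched are just for illustration.

```r
s <- c('abc1', 'abc2', 'abc3', 'abc11', 'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# ^abc                   : string must start with "abc"
# .*                     : anything in between
# (?<![[:lower:]][12])$  : at the end of the string, the last two
#                          characters must NOT be a lowercase letter
#                          followed by 1 or 2
pattern <- '^abc.*(?<![[:lower:]][12])$'

matched <- s[grepl(pattern, s, perl = TRUE)]
matched
```

On the sample vector this should keep abc3, abc11, abc12, abcde3, abcde11 and abcde12. Whether it is easier to read than two simple grepl() calls is a matter of taste.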