I'm trying to do a lookbehind regex in R to find a pattern. I expect this would pull the 'b' in 'bob', but instead I get an error.
> regexpr("(?<=a)b","thingamabob")
Error in regexpr("(?<=a)b", "thingamabob") :
invalid regular expression '(?<=a)b', reason 'Invalid regexp'
This does not throw an error, but it also doesn't find anything.
> regexpr("(.<=a)b","thingamabob")
[1] -1
attr(,"match.length")
[1] -1
attr(,"useBytes")
[1] TRUE
I'm confused because the help page for regexpr specifically indicates that lookbehind should work: http://stat.ethz.ch/R-manual/R-patched/library/base/html/regex.html
Any ideas?
A regex lookbehind is an assertion (available in Python's re module, among other engines) that succeeds or fails depending on whether a pattern appears immediately behind, i.e. to the left of, the engine's current position. Lookarounds do not consume any characters; they only test a condition at a position. Hence lookbehind and lookahead are termed zero-width assertions.
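As a rough illustration of a lookbehind, here is a minimal Python sketch using the same pattern and string as the question. The variable names are only for illustration, and note that Python reports 0-based positions, whereas R's regexpr reports 1-based ones.

```python
import re

text = "thingamabob"

# (?<=a)b matches a "b" only when the character just before it is "a".
match = re.search(r"(?<=a)b", text)

# The lookbehind is zero-width: the match contains "b" but not the "a".
matched_text = match.group()
start, end = match.span()

print(matched_text, start, end)
```

The match is a single character long even though the pattern also inspected the preceding "a".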
As we've seen, a lookaround looks left or right but it doesn't add any characters to the match to be returned by the regex engine. Likewise, an anchor such as ^ and a boundary such as \b can match at a given position in the string, but they do not add any characters to the match.
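To make the comparison with anchors and boundaries concrete, the following Python sketch (the sample string is an arbitrary assumption) shows that ^ and \b match at positions and return empty strings.

```python
import re

text = "a bob"

# ^ matches at the start of the string without consuming anything.
start_anchor = re.search(r"^", text)
anchor_text = start_anchor.group()

# \b matches at every word boundary; each match is an empty string.
boundary_matches = list(re.finditer(r"\b", text))
boundary_positions = [m.start() for m in boundary_matches]
boundary_texts = [m.group() for m in boundary_matches]

print(repr(anchor_text), boundary_positions, boundary_texts)
```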
Positive lookahead: the regex engine checks whether a particular element (a character, several characters, or a group) appears immediately after the item matched. If that element is present, the engine accepts the match; otherwise it rejects it. The element itself is not included in the match.
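A small Python sketch of a positive lookahead might look like this. The sample text and the " USD" suffix are made-up assumptions chosen only to show the behavior.

```python
import re

text = "price: 100 USD, 200 EUR"

# \d+(?= USD) matches a run of digits only when " USD" follows it.
# The " USD" part is checked but not included in the match.
usd_amounts = re.findall(r"\d+(?= USD)", text)

print(usd_amounts)
```

Here "200" is rejected because it is followed by " EUR", and the accepted match contains only the digits.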
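You just need to switch to Perl-compatible regular expressions by setting perl = TRUE, e.g. regexpr("(?<=a)b", "thingamabob", perl = TRUE), which matches at position 9 with a match length of 1. The default engine uses extended regular expressions, which do not support lookaround; the help page describes lookbehind in its section on Perl-like regular expressions.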