Why is this regex using lookbehinds invalid in R?

Tags: regex, r
I'm trying to do a lookbehind regex in R to find a pattern. I expect this would pull the 'b' in 'bob', but instead I get an error.

> regexpr("(?<=a)b","thingamabob")
Error in regexpr("(?<=a)b", "thingamabob") : 
invalid regular expression '(?<=a)b', reason 'Invalid regexp'

This does not throw an error, but it also doesn't find anything.

> regexpr("(.<=a)b","thingamabob")
[1] -1
attr(,"match.length")
[1] -1
attr(,"useBytes")
[1] TRUE

I'm confused because the help page for regexpr specifically indicates that lookbehind should work: http://stat.ethz.ch/R-manual/R-patched/library/base/html/regex.html

Any ideas?

Jesse asked Nov 16 '12 16:11


1 Answer

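You just need to switch to Perl-compatible regular expressions by setting perl = TRUE. R's default regex engine does not support lookbehind, which is why the pattern is rejected as invalid.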
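
As a minimal sketch of that fix, here is the call from the question with perl = TRUE added. The variable names x and m are only for illustration, and regmatches() is used here just to pull out the matched text; it is not part of the original question.

```r
x <- "thingamabob"

# Without perl = TRUE the default engine rejects the lookbehind syntax:
# regexpr("(?<=a)b", x)   # Error: invalid regular expression

# With perl = TRUE the pattern is handled by PCRE, which supports lookbehind.
m <- regexpr("(?<=a)b", x, perl = TRUE)
print(m)                   # start position and match.length of the match

# Extract the matched text itself.
print(regmatches(x, m))
```

Here the match should be the 'b' at position 9 (the one preceded by 'a'), with a match length of 1, since the lookbehind itself consumes no characters. The second attempt in the question, "(.<=a)b", is not a lookbehind at all: it looks for any character followed by the literal text "<=a" and then "b", which is why it returns -1.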

joran answered Oct 12 '22 08:10