
regex: Why doesn't this negative lookahead work?

Tags:

regex

I have text like this:

real:: a
real :: b
real c

Now I want to match only the occurrences of real that are not followed by ::. In this case, that is only the third real. So I tried a regex with a negative lookahead:

real\s*(?!::)

But this matches:

real :: b
real c

Since \s* means zero or more whitespace characters, why is real :: b matched?

Update

Thanks to Wiktor Stribiżew. Using the regex101 debugging tool, we can see that backtracking is what complicates things.

I came up with another, similar task that I can't solve:

real (xx(yy)) :: a
real (zz(pp)):: b
real (cc(rr)) c

Again, I want to match only real (cc(rr)), which is not followed by ::.

real\s*\(.*?\)+(?!\s*::)

This is what I tried, but it failed. According to the regex debugger, this is also due to backtracking. How can I do this correctly?

user15964 asked Jul 25 '16 16:07



1 Answer

You need to put the \s* into the lookahead:

real(?!\s*::)

See the regex demo

The pattern real\s*(?!::) still matches the real in real :: b. First, real matches the literal text, then \s* matches the following space, and then the lookahead fails because :: comes next. At that point the engine backtracks: \s* gives back the space it matched and the lookahead is tried again. Since \s* can match an empty string, the lookahead is now checked directly after real, where the next character is a space rather than ::, so it succeeds and the real before :: b gets matched.
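The difference between the two patterns can be checked with a small script. This is a minimal sketch that assumes Python's built-in re module (the question does not say which regex flavor is in use), and the helper name matching_lines is made up for illustration:

```python
import re

text = "real:: a\nreal :: b\nreal c"

# Pattern from the question: \s* sits outside the lookahead,
# so it can backtrack to zero whitespace characters.
original = re.compile(r"real\s*(?!::)")
# Pattern from the answer: the optional whitespace is tested inside the lookahead.
fixed = re.compile(r"real(?!\s*::)")

def matching_lines(pattern, text):
    """Return the lines in which the pattern finds a match (hypothetical helper)."""
    return [line for line in text.splitlines() if pattern.search(line)]

original_hits = matching_lines(original, text)
fixed_hits = matching_lines(fixed, text)

print(original_hits)  # ['real :: b', 'real c']
print(fixed_hits)     # ['real c']

# On "real :: b" the original pattern matches only "real":
# \s* gave the space back so that the lookahead could succeed.
print(repr(original.search("real :: b").group()))  # 'real'
```

Note that with the original pattern the match on real :: b is just real, without the trailing space, which is the visible trace of the backtracking step described above.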

The regex debugger at regex101 shows what is going on behind the scenes.
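The follow-up pattern from the question's update, real\s*\(.*?\)+(?!\s*::), appears to fail for the same reason: when the lookahead fails, \)+ can give back a closing parenthesis and the lookahead then succeeds at an earlier position. The answer above does not cover this case, so the following is only one possible approach, sketched under the assumption of Python's re module and one item per line: move the whole "parentheses followed by ::" test into a negative lookahead placed directly after real, so nothing consumed earlier can backtrack around it.

```python
import re

text = "real (xx(yy)) :: a\nreal (zz(pp)):: b\nreal (cc(rr)) c"

# Pattern from the question: the lookahead comes last,
# so \)+ can give back a ")" and let the lookahead succeed.
attempt = re.compile(r"real\s*\(.*?\)+(?!\s*::)")

# Sketch (an assumption, not part of the answer): reject "real" up front
# if any way of reading "(...)" after it is followed by "::",
# and only then consume the parenthesised part.
sketch = re.compile(r"real(?!\s*\(.*?\)+\s*::)\s*\(.*?\)+")

attempt_matches = [m.group() for m in attempt.finditer(text)]
sketch_matches = [m.group() for m in sketch.finditer(text)]

print(attempt_matches)  # ['real (xx(yy)', 'real (zz(pp)', 'real (cc(rr))']
print(sketch_matches)   # ['real (cc(rr))']
```

This sketch relies on . not matching newlines, and it does not truly balance nested parentheses, so it should be treated as a starting point for input shaped like the sample rather than a general solution.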

Wiktor Stribiżew answered Jan 03 '23 05:01