
Negative lookahead with capturing groups

I'm attempting this challenge:

https://regex.alf.nu/4

I want to match all strings that don't contain an ABBA pattern.

Match:

aesthophysiology
amphimictical
baruria
calomorphic

Don't Match

anallagmatic
bassarisk
chorioallantois
coccomyces
abba

First, I have a regex that detects the ABBA pattern:

(\w)(\w)\2\1
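As a sanity check, the detector can be run against the two word lists above. This is a minimal sketch in Python, which is an assumption made only for illustration (the challenge itself takes just the regex); the names ABBA, should_match and should_not_match are made up for the example.

```python
import re

# The ABBA detector: two captured characters followed by the same two in reverse.
ABBA = re.compile(r"(\w)(\w)\2\1")

should_match = ["aesthophysiology", "amphimictical", "baruria", "calomorphic"]
should_not_match = ["anallagmatic", "bassarisk", "chorioallantois", "coccomyces", "abba"]

# re.search scans the whole string, so it reports whether ABBA occurs anywhere.
abba_free = [w for w in should_match if ABBA.search(w) is None]
abba_found = {w: ABBA.search(w).group(0) for w in should_not_match}

print(abba_free)   # words with no ABBA pattern
print(abba_found)  # each rejected word with the ABBA substring that was found
```

Every word in the first list comes back ABBA-free, and every word in the second list contains a hit such as "alla" or "occo", so the detector itself behaves as intended.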

Next I want to match strings that don't contain that pattern:

^((?!(\w)(\w)\2\1).)*$

However, this matches everything.

If I simplify this by specifying a literal for the negative lookahead:

^((?!agm).)*$

Then the regex does not match the string "anallagmatic", which is the desired behaviour.
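The literal case can be reproduced in a few lines. Again this is a sketch in Python (assumed for illustration only), and the name no_agm is hypothetical.

```python
import re

# At each position: refuse to advance if "agm" starts here, otherwise consume one character.
no_agm = re.compile(r"^((?!agm).)*$")

rejected = no_agm.match("anallagmatic")    # "agm" occurs inside the word
accepted = no_agm.match("amphimictical")   # no "agm" anywhere

print(rejected is None)      # True
print(accepted is not None)  # True
```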

So it looks like the issue is with me using capturing groups and back-references within the negative lookahead.

JNB asked Sep 30 '15 09:09



1 Answer

^(?!.*(.)(.)\2\1).+$

    ^^

You can use a lookahead here. See the demo. The lookahead you created was correct, but you need to add .* in front of it so that the pattern is rejected wherever it appears in the string.

https://regex101.com/r/vV1wW6/39
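For readers who want to try this outside regex101, here is a minimal Python sketch of the same regex applied to the words from the question. Python is an assumption on my part, and the names no_abba, words and kept are illustrative.

```python
import re

# A single lookahead anchored at the start: fail if an ABBA pattern appears anywhere ahead.
no_abba = re.compile(r"^(?!.*(.)(.)\2\1).+$")

words = [
    "aesthophysiology", "amphimictical", "baruria", "calomorphic",       # should match
    "anallagmatic", "bassarisk", "chorioallantois", "coccomyces", "abba",  # should not
]

kept = [w for w in words if no_abba.match(w)]
print(kept)
```

Only the first four words are kept. Because the lookahead runs once at the start of the string, the inner groups are numbered 1 and 2 and the back-references \2\1 mean what they appear to mean.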

Your approach will also work if you make the outer group non-capturing.

^(?:(?!(\w)(\w)\2\1).)*$

  ^^

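See the demo. It was not working because \2 and \1 referred to different groups than you intended. In your regex the outer parentheses are group 1, so the two (\w) groups are groups 2 and 3, and the back-references should have been \3 and \2.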

https://regex101.com/r/vV1wW6/40
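The group numbering can be checked directly. The sketch below uses Python's re module (an assumption for illustration; the names fixed_a, fixed_b and original_compiles are made up). It compares the non-capturing fix with the variant that keeps the outer capturing group and shifts the back-references to \3\2. As a side note, Python's re refuses to compile the original pattern at all, because \1 points at a group that is still open; other engines accept it and behave as described in the question.

```python
import re

# Outer group non-capturing: the two inner groups are numbered 1 and 2.
fixed_a = re.compile(r"^(?:(?!(\w)(\w)\2\1).)*$")
# Outer group kept as capturing: the inner groups become 2 and 3, so the back-references shift.
fixed_b = re.compile(r"^((?!(\w)(\w)\3\2).)*$")

print(fixed_a.groups, fixed_b.groups)  # number of capturing groups in each pattern

words = ["aesthophysiology", "amphimictical", "baruria", "calomorphic",
         "anallagmatic", "bassarisk", "chorioallantois", "coccomyces", "abba"]
results_a = [bool(fixed_a.match(w)) for w in words]
results_b = [bool(fixed_b.match(w)) for w in words]
print(results_a == results_b)  # both fixes agree on every word

# The original pattern: \1 refers to the outer group while it is still being parsed.
try:
    re.compile(r"^((?!(\w)(\w)\2\1).)*$")
    original_compiles = True
except re.error:
    original_compiles = False
print(original_compiles)
```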

vks answered Oct 19 '22 20:10