
PHP regex: Alternative to backreference in negative lookbehind

I want to find instances where a captured group does not appear later in the string:

aaaBbb  = CccBbb  <- format is valid, skip
aaaDddd = CccDddd <- format is valid, skip
aaaEeee = CccFfff <- format is not valid, match this one only

This pattern matches the valid lines, which are the ones I don't want (https://regex101.com/r/lon87L/1):

/^ +\w+([A-Z][a-z]+) += +\w+\1$/mg

I've read at https://www.regular-expressions.info/refadv.html that PHP doesn't support backreferences inside a negative lookbehind, although some other regex implementations do. So something like this would match the invalid lines I'm after, but it doesn't work in PHP:

/^ +\w+([A-Z][a-z]+) += +\w+(?<!\1)$/mg

Is there anything else that would work, other than matching all three lines and looping through the matches in a PHP foreach?

Redzarf asked Oct 16 '22 10:10


2 Answers

Try using a negative lookahead instead of a negative lookbehind. It does the same job here, and PHP supports backreferences inside a lookahead.

^ +\w+([A-Z][a-z]+) += +(?!\w+\1).*$

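For reference, here is a minimal PHP sketch of how the pattern above might be applied. The sample text is the question's three lines with leading spaces added, which is an assumption: the pattern starts with `^ +`, so it needs at least one leading space per line. The `g` flag is dropped because PHP does not accept it; `preg_match_all()` already returns every match.

```php
<?php
// Sample input. The leading spaces are an assumption (the pattern starts with "^ +").
$text = <<<TXT
  aaaBbb  = CccBbb
  aaaDddd = CccDddd
  aaaEeee = CccFfff
TXT;

// Negative lookahead placed right after "= ": the match is rejected when
// some run of word characters followed by the captured group comes next.
$pattern = '/^ +\w+([A-Z][a-z]+) += +(?!\w+\1).*$/m';

// $matches[0] holds the full text of every matching line.
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```

With this input only the third line should come back, since it is the only one whose right-hand side does not repeat the captured group.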

Ethan answered Oct 20 '22 23:10


One option is to put a negative lookahead for \1$ immediately before each repeated \w after the =:

^ +\w+([A-Z][a-z]+) += +(?:(?!\1$)\w)+$
                        ^^^^^^^^^^^^^^

https://regex101.com/r/lon87L/2

But that only rejects a line when the backreference occurs right at the end of the string. If you want to ensure that the previously captured text doesn't occur anywhere within the final run of \w characters, just remove the $ from inside the repeated group:

^ +\w+([A-Z][a-z]+) += +(?:(?!\1)\w)+$
                                ^

https://regex101.com/r/lon87L/3
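To make the difference between the two variants concrete, here is a small PHP sketch. The test lines are hypothetical: the second one is not from the question and was added to show a captured group that reappears but not at the end. Leading spaces are assumed because both patterns start with `^ +`.

```php
<?php
// Hypothetical test lines; leading spaces assumed (both patterns start with "^ +").
$lines = [
    '  aaaBbb  = CccBbb',   // captured "Bbb" ends the right-hand side
    '  aaaBbb  = BbbCcc',   // captured "Bbb" reappears, but not at the end
    '  aaaEeee = CccFfff',  // captured "Eeee" never reappears
];

// Variant 1: only forbids the captured group at the very end.
$endOnly  = '/^ +\w+([A-Z][a-z]+) += +(?:(?!\1$)\w)+$/';
// Variant 2: forbids the captured group anywhere in the final word.
$anywhere = '/^ +\w+([A-Z][a-z]+) += +(?:(?!\1)\w)+$/';

$results = [];
foreach ($lines as $line) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[] = [
        preg_match($endOnly, $line) === 1,
        preg_match($anywhere, $line) === 1,
    ];
    printf("%-22s endOnly=%s anywhere=%s\n",
        $line,
        var_export(end($results)[0], true),
        var_export(end($results)[1], true));
}
```

The first line should be rejected by both patterns and the third accepted by both. Only the second line separates them: the end-anchored variant accepts it, while the unanchored one rejects it.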

CertainPerformance answered Oct 21 '22 01:10