I want to find instances where a captured group does not appear later in the string:
aaaBbb = CccBbb <- format is valid, skip
aaaDddd = CccDddd <- format is valid, skip
aaaEeee = CccFfff <- format is not valid, match this one only
So this matches the lines I don't want to match ( https://regex101.com/r/lon87L/1 )
/^ +\w+([A-Z][a-z]+) += +\w+\1$/mg
I've read on https://www.regular-expressions.info/refadv.html that PHP doesn't support backreferences inside a negative lookbehind, although some other regex implementations do. So something like this would match the invalid lines I'm after, but it doesn't work in PHP:
/^ +\w+([A-Z][a-z]+) += +\w+(?<!\1)$/mg
Is there anything else that would work, other than matching all three lines and looping through the matches in a PHP foreach?
Try using a negative lookahead instead of a negative lookbehind. It works equally well, and it works in PHP.
^ +\w+([A-Z][a-z]+) += +(?!\w+\1).*$
regex101 demo
PHP demo
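For anyone who wants to check the lookahead version outside of regex101, here is a minimal sketch using Python's re module, which also allows a backreference inside a lookahead. The sample lines come from the question; the leading spaces are an assumption, inferred from the "^ +" at the start of the pattern, and the variable names are just for illustration.

```python
import re

# Sample lines from the question. The leading spaces are assumed,
# because the pattern requires at least one space after "^".
text = (
    "  aaaBbb = CccBbb\n"
    "  aaaDddd = CccDddd\n"
    "  aaaEeee = CccFfff\n"
)

# After "= ", the negative lookahead rejects the line if some word
# characters followed by the captured group (\1) can be found.
pattern = re.compile(r"^ +\w+([A-Z][a-z]+) += +(?!\w+\1).*$", re.MULTILINE)

# Collect the full text of every line the pattern matches.
invalid_lines = [m.group(0) for m in pattern.finditer(text)]
print(invalid_lines)
```

Only the third line should end up in invalid_lines, since it is the only one where the captured part is not repeated after the equals sign.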
One option would be to, right before each repeated \w after the =, use negative lookahead for \1$:
^ +\w+([A-Z][a-z]+) += +(?:(?!\1$)\w)+$
                        ^^^^^^^^^^^^^^
https://regex101.com/r/lon87L/2
But that only excludes a match if the backreference occurs right at the end of the string. If you want to ensure that the previously matched phrase doesn't occur anywhere within the final run of \w characters, just remove the $ from inside the repeated group:
^ +\w+([A-Z][a-z]+) += +(?:(?!\1)\w)+$
                                ^
https://regex101.com/r/lon87L/3
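To make the difference between the two variants concrete, here is a small sketch in Python's re module, which accepts the same lookahead syntax. The second sample line (with the captured part in the middle of the last word) is a hypothetical extra case that is not in the question, and the leading spaces are again assumed from the "^ +" in the pattern.

```python
import re

text = (
    "  aaaBbb = CccBbb\n"      # captured part repeated at the very end
    "  aaaBbb = CccBbbDdd\n"   # captured part repeated mid-word (hypothetical case)
    "  aaaEeee = CccFfff\n"    # captured part not repeated at all
)

# Variant 1: only rejects the line when \1 sits right at the end of the line.
end_only = re.compile(r"^ +\w+([A-Z][a-z]+) += +(?:(?!\1$)\w)+$", re.MULTILINE)

# Variant 2: rejects the line when \1 starts anywhere in the final word.
anywhere = re.compile(r"^ +\w+([A-Z][a-z]+) += +(?:(?!\1)\w)+$", re.MULTILINE)

end_only_hits = [m.group(0) for m in end_only.finditer(text)]
anywhere_hits = [m.group(0) for m in anywhere.finditer(text)]

print(end_only_hits)
print(anywhere_hits)
```

With these inputs, the first variant still reports the CccBbbDdd line as invalid, while the second variant reports only the CccFfff line. Which one is right depends on whether a repeat in the middle of the word should count as valid.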