
Java regex error - Look-behind with group reference

I'm trying to build a regex that matches exactly two consecutive occurrences of a character from a character class. This is the regex I made:

(?<!\1)([^raol1c])\1(?!\1)

As you can see, it uses a negative look-ahead and a negative look-behind. But, as usual, the latter does not work: Java throws the well-known exception "Look-behind group does not have an obvious maximum length", even though the look-behind clearly has a maximum length (exactly one char).
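A minimal sketch to reproduce the error, assuming nothing beyond `java.util.regex`. The class name `LookBehindRepro` and the helper `compileError` are made up for illustration; the backslashes are doubled because the pattern sits in a Java string literal.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LookBehindRepro {
    // The pattern from the question, escaped for a Java string literal.
    static final String REGEX = "(?<!\\1)([^raol1c])\\1(?!\\1)";

    // Returns the compiler's error description, or null if the pattern compiles.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // Prints the description of the PatternSyntaxException, if one is thrown.
        System.out.println(compileError(REGEX));
    }
}
```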

Ideally the regex should match "hh", "jhh", "ahh", "hhj", "hha" but not "hhh".

Any ideas on how to work around this?

user2402372 asked May 20 '13 16:05



1 Answer

Here is a workaround. It's ugly but apparently it works:

(?<!(?=\1).)([^raol1c])\1(?!\1)

Putting the backreference into a zero-width lookahead inside the lookbehind gives the lookbehind a fixed length that Java can determine (one character, from the dot).

Disclaimer: I did not come up with this (unfortunately). See: Backreferences in lookbehind

EDIT:

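The above pattern does not rule out hhh. The likely reason is that group 1 has not been captured yet when the lookbehind is evaluated, and Java treats a backreference to an unset group as a failed match, so the negative lookbehind always succeeds. However, this works: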

(?<!(.)(?=\1))([^raol1c])\2(?!\2)

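If we create the first group inside the lookbehind, we can use it to ensure that the first character after the lookbehind is not the same as the one before it. The group that captures the pair then becomes group 2, which is why the rest of the pattern refers to \2.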
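To compare the two patterns on the strings from the question, here is a small sketch. The class name `ExactPairDemo` and the helper `found` are hypothetical names chosen for illustration, and the backslashes are doubled for Java string literals.

```java
import java.util.regex.Pattern;

class ExactPairDemo {
    // First workaround: backreference to group 1 from inside the lookbehind.
    static final Pattern FIRST =
        Pattern.compile("(?<!(?=\\1).)([^raol1c])\\1(?!\\1)");

    // Edited pattern: group 1 is captured inside the lookbehind,
    // and the repeated pair is checked with group 2.
    static final Pattern EDITED =
        Pattern.compile("(?<!(.)(?=\\1))([^raol1c])\\2(?!\\2)");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {"hh", "jhh", "ahh", "hhj", "hha", "hhh"};
        for (String s : inputs) {
            System.out.println(s + " first=" + found(FIRST, s)
                                 + " edited=" + found(EDITED, s));
        }
    }
}
```

With the edited pattern, only hhh should be reported as not found; the first pattern should still find a match in hhh.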

Working demo.

Martin Ender answered Oct 04 '22 18:10
