I have this regular expression ^(?!.*?([aceg]).*?\1)(?!.*?([i])(?:.*?\2){2})[acegi]+$ which works as expected (e.g. in Ruby), but not in PostgreSQL due to "invalid backreference number".
How to solve it and keep given functionality?
Part of the SQL command: WHERE (name ~ '^(?!.*?([aceg]).*?\1)(?!.*?([i])(?:.*?\2){2})[acegi]+$')
Note: I tried escaping backslash like \\ without any error, but PG were returning invalid matches (like "aaa").
The problem with Postgresql is that first, it doesn't support having capture groups within its lookahead assertions. That said, all capture groups within a lookahead will be treated as non-capture groups ((?: ... )), emphasis mine:
Lookahead constraints cannot contain back references (see Section 9.7.3.3), and all parentheses within them are considered non-capturing.[1]
So even if PostgreSQL did support having backreferences within a lookahead, it would still be failing to work as expected due to the above constraint (without a capture group, you cannot have a backreference).
A possible workaround (will be lengthy for complex requirements unfortunately) would be to count the number of each character:
WHERE
LENGTH(REGEXP_REPLACE(name, '[^a]+', '', 'g')) < 2 AND
LENGTH(REGEXP_REPLACE(name, '[^c]+', '', 'g')) < 2 AND
LENGTH(REGEXP_REPLACE(name, '[^e]+', '', 'g')) < 2 AND
LENGTH(REGEXP_REPLACE(name, '[^g]+', '', 'g')) < 2 AND
LENGTH(REGEXP_REPLACE(name, '[^i]+', '', 'g')) < 3 AND
LENGTH(REGEXP_REPLACE(name, '[acegi]+', '', 'g')) = 0;
[condition taken and modified from this answer; the last row is to ensure there are only those characters in the string]
Lookahead constraints may not contain back references.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With