In my code I generate a regular expression from a list of subexpressions. Joining expressions works fine if I put each of them in a non-capturing group (?:…):
# concatenation:
joined_expr = ''.join('(?:{})'.format(expr) for expr in subexpression)
# disjunction:
joined_expr = '|'.join('(?:{})'.format(expr) for expr in subexpression)
The problem is that the result of this join is itself a subexpression of a bigger expression, and subexpression could be empty. In that case the joined expression is the empty pattern, which matches the empty string, and it must not.
So what would be the easiest way to make a regular expression that cannot match? Would (?:(?!.).) work? If not, why not? Would Python's re engine understand my attempt to create a failing branch and optimize it?
Spare the regex engine the wasted time by using:
\Zx # or '$s' to match a literal after the end of the string
It is much simpler than (?:(?!.).), it is cheaper on long strings, and you obtain the same result.
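As a minimal sketch of how this could be wired into the joining code from the question: fall back to \Zx whenever the list is empty. The names NEVER_MATCH and join_alternatives are hypothetical, introduced only for this illustration.

```python
import re

# Hypothetical constant: requires a literal 'x' after the end of the string,
# which is impossible, so the pattern can never match.
NEVER_MATCH = r'\Zx'


def join_alternatives(subexpressions):
    """Join subexpressions into a disjunction (hypothetical helper).

    An empty list yields the never-matching pattern instead of the
    empty pattern, which would match the empty string.
    """
    if not subexpressions:
        return NEVER_MATCH
    return '|'.join('(?:{})'.format(expr) for expr in subexpressions)


empty = re.compile(join_alternatives([]))
some = re.compile(join_alternatives(['foo', 'ba[rz]']))
```

With this fallback, empty.search(...) returns None for any input, including the empty string, while the non-empty case behaves as before.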
Here is a short online test with a text of 4231 chars:
Test negative lookahead - (?:(?!.).) - 16924 steps
Test after end anchor - \Zx - 2 steps
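The step counts above come from an online tester, and Python's re engine may behave differently. A rough way to compare the two patterns in Python itself is to time them with timeit. This is only a sketch: the text below is a stand-in of the same length (an assumption, not the original test text), and the timings will vary by machine and Python version.

```python
import re
import timeit

# Stand-in input with the same length as the text in the test above.
text = 'a' * 4231

lookahead = re.compile(r'(?:(?!.).)')
after_end = re.compile(r'\Zx')

# Neither pattern can match anywhere in the text, so both searches return None.
lookahead_result = lookahead.search(text)
after_end_result = after_end.search(text)

# Time 1000 failed searches with each pattern.
t_lookahead = timeit.timeit(lambda: lookahead.search(text), number=1000)
t_after_end = timeit.timeit(lambda: after_end.search(text), number=1000)
print('lookahead: {:.4f}s, after end: {:.4f}s'.format(t_lookahead, t_after_end))
```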