I'm asking in general for a tool or method to find the "hot spot" in a regex that causes uncontrolled backtracking. I have a fairly good grasp of possessive matching, negative lookahead assertions, atomic groups, etc., but I'm facing a situation where it's not clear exactly where my regex goes wrong.
The problematic regex is a PCRE regex, but I'll be happy with pointers for any language.
Ideally, I'd like to see a tool that highlights the "hot spots" in a regex. In the past I tried to write a wrapper for perl -Mre=debug, but couldn't figure out how to process its output usefully. Roughly, the idea would be to run one or more input strings against a regex and collect the offsets in the regex (and perhaps the offsets in the string) that the matcher keeps coming back to.
Damian Conway's brand new Regexp::Debugger module for Perl lets you watch an animation of your regex being matched against a string. It should make it fairly easy to spot excessive backtracking. Just install it and use the included rxrx script that lets you enter a regex and a string to match it against.
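To illustrate the idea from the question (counting which regex offsets the matcher keeps coming back to), here is a minimal sketch in Python. It is not PCRE and not how Regexp::Debugger works. It is a toy backtracking matcher that only understands literal characters, `.`, and `x*`, and the names `profile_match`, `char_ok`, and `match_here` are made up for this example. It only shows what a per-offset "heat" count could look like.

```python
from collections import Counter


def profile_match(pattern, text):
    """Anchored match of a toy pattern, counting attempts per pattern offset.

    Toy syntax only: literal characters, '.', and 'x*' (greedy).
    Returns (matched, visits), where visits[i] is how many times the
    matcher tried the pattern element starting at offset i.
    """
    visits = Counter()

    def char_ok(p, t):
        # Does the single-character element at pattern[p] accept text[t]?
        return t < len(text) and pattern[p] in (".", text[t])

    def match_here(p, t):
        if p == len(pattern):
            return True  # whole pattern consumed
        visits[p] += 1
        if p + 1 < len(pattern) and pattern[p + 1] == "*":
            # Greedy star: take as much as possible, then give back one
            # character at a time, retrying the rest of the pattern.
            end = t
            while char_ok(p, end):
                end += 1
            while end >= t:
                if match_here(p + 2, end):
                    return True
                end -= 1
            return False
        return char_ok(p, t) and match_here(p + 1, t + 1)

    return match_here(0, 0), visits


if __name__ == "__main__":
    pattern = "a*a*a*b"
    matched, visits = profile_match(pattern, "a" * 20 + "c")
    print("matched:", matched)
    for offset, count in visits.most_common():
        print(f"offset {offset} ({pattern[offset]!r}): {count} attempts")
```

With the failing input above, the count climbs steeply from left to right and the final `b` at offset 6 comes out hottest: each of the preceding `a*` elements can give back characters, so the matcher retries the tail of the pattern for every way of splitting the run of `a`s. For a real PCRE or Perl regex you would need to collect the same kind of counts from the engine's own trace, since this toy matcher does not model its optimizations.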