The following code blocks on my system. Why?
System.out.println( Pattern.compile(
"^((?:[^'\"][^'\"]*|\"[^\"]*\"|'[^']*')*)/\\*.*?\\*/(.*)$",
Pattern.MULTILINE | Pattern.DOTALL ).matcher(
"\n\n\n\n\n\nUPDATE \"$SCHEMA\" SET \"VERSION\" = 12 WHERE NAME = 'SOMENAMEVALUE';"
).matches() );
The pattern (designed to detect comments of the form /*...*/
but not within '
or "
) should be fast, as it is deterministic...
Why does it take soooo long?
You're running into catastrophic backtracking.
Looking at your regex, it's easy to see how .*?
and (.*)
can match the same content since both also can match the intervening \*/
part (dot matches all, remember). Plus (and even more problematic), they can also match the same stuff that ((?:[^'"][^'"]*|"[^"]*"|'[^']*')*)
matches.
The regex engine gets bogged down in trying all the permutations, especially if the string you're testing against is long.
I've just checked your regex against your string in RegexBuddy. It aborts the match attempt after 1.000.000 steps of the regex engine. Java will keep churning on until it gets through all permutations or until a Stack Overflow occurs...
You can greatly improve the performance of your regex by prohibiting backtracking into stuff that has already been matched. You can use atomic groups for this, changing your regex into
^((?>[^'"]+|"[^"]*"|'[^']*')*)(?>/\*.*?\*/)(.*)$
or, as a Java string:
"^((?>[^'\"]+|\"[^\"]*\"|'[^']*')*)(?>/\\*.*?\\*/)(.*)$"
This reduces the number of steps the regex engine has to go through from > 1 million to 58.
Be advised though that this will only find the first occurrence of a comment, so you'll have to apply the regex repeatedly until it fails.
Edit: I just added two slashes that were important for the expressions to work. Yet I had to change more than 6 characters.... :(
I recommend that you read Regular Expression Matching Can Be Simple And Fast (but is slow in Java, Perl, PHP, Python, Ruby, ...).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With