
Regexp causes matcher to hang indefinitely

Tags: regex, groovy

I have the following regexp, which never finishes evaluating and hangs indefinitely:

import java.util.regex.Matcher
String AUTOGENERATED_HEADER = "#-=-=-= AUTOGENERATED HEADER =-=-=-"
String AUTOGENERATED_FOOTER = "#-=-=-= AUTOGENERATED FOOTER =-=-=-"

String messages = '''#-=-=-= AUTOGENERATED HEADER =-=-=-
a=b
c=d
x=y
#-=-=-= AUTOGENERATED FOOTER =-=-=-
'''


Matcher matcher = messages =~ /${AUTOGENERATED_HEADER}[\r\n]+((.*[\r\n]*)*)${AUTOGENERATED_FOOTER}/
matcher.find()

The problem is with the part (.*[\r\n]*). When I change it to (.*[\r\n]+), it works.

You can experiment with the regexp here. Can anybody explain how this is possible?

Tomas Bartalos asked Oct 17 '25 14:10


1 Answer

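What you have here is a case of catastrophic backtracking. The culprit is the (.*[\r\n]*)* part, which is followed by further subpatterns that must also match. Because [\r\n]* may match nothing, the outer * can split a single line between iterations in exponentially many ways, and the engine tries them all once it has to backtrack over the footer line that the greedy group swallowed. With [\r\n]+, every iteration must end in a line break, so most of that ambiguity disappears. You can watch the backtracking in the regex debugger at regex101.com.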
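To get a feel for the scale, here is a toy Groovy model. It is not the regex engine itself: the hypothetical helper countSplits only counts the ways a run of n non-newline characters can be cut into consecutive non-empty pieces, which is the choice (.*[\r\n]*)* faces on every line when [\r\n]* matches nothing. The real step count depends on the engine and its version.

```groovy
// Toy model only: counts the ways n characters can be cut into consecutive
// non-empty pieces, one piece per pass of (.*[\r\n]*)* with an empty [\r\n]*.
BigInteger countSplits(int n) {
    // ways[k] = number of ways to split the first k characters
    List<BigInteger> ways = [1G]
    for (int k = 1; k <= n; k++) {
        BigInteger total = 0G
        // the last piece may start at any earlier position j
        for (int j = 0; j < k; j++) {
            total += ways[j]
        }
        ways << total
    }
    return ways[n]
}

// 35 is the length of the footer line in the question
[5, 10, 20, 35].each { n ->
    println "${n} characters -> ${countSplits(n)} splits"
}
```

The count doubles with every extra character (2^(n-1) splits for n characters), so the 35-character footer line alone allows 2^34 splits, roughly 17 billion.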

One solution is lazy dot matching: replace [\r\n]+((.*[\r\n]*)*) with .*? and add the (?s) modifier at the start of the pattern so that . also matches line breaks. The other is an unrolled version, which is much better for long inputs but requires some hardcoding.

With the sample values, the lazy pattern becomes (?s)#-=-=-= AUTOGENERATED HEADER =-=-=-.*?#-=-=-= AUTOGENERATED FOOTER =-=-=-. In Groovy, use

Matcher matcher = messages =~ /(?s)${AUTOGENERATED_HEADER}.*?${AUTOGENERATED_FOOTER}/
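The unrolled version is not spelled out above, so the following is only a sketch of what it might look like, not necessarily the exact pattern the answer had in mind. It assumes the footer starts with '#' (that is the hardcoding) and adds two things that are not in the original: Pattern.quote around the delimiters, and a capturing group around the body so both variants can be compared.

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

String AUTOGENERATED_HEADER = "#-=-=-= AUTOGENERATED HEADER =-=-=-"
String AUTOGENERATED_FOOTER = "#-=-=-= AUTOGENERATED FOOTER =-=-=-"

String messages = '''#-=-=-= AUTOGENERATED HEADER =-=-=-
a=b
c=d
x=y
#-=-=-= AUTOGENERATED FOOTER =-=-=-
'''

// Quote the delimiters so they are always matched literally (an addition, not in the original).
String header = Pattern.quote(AUTOGENERATED_HEADER)
String footer = Pattern.quote(AUTOGENERATED_FOOTER)
// Hardcoded assumption: the footer starts with '#', so only the rest goes into the lookahead.
String footerTail = Pattern.quote(AUTOGENERATED_FOOTER.substring(1))

// Lazy variant from the answer, with a capturing group added around the body.
Pattern lazy = Pattern.compile('(?s)' + header + '[\\r\\n]+(.*?)' + footer)

// Unrolled variant: runs of non-'#' characters, plus any '#' that does not begin the footer.
Pattern unrolled = Pattern.compile(
        header + '[\\r\\n]+([^#]*(?:#(?!' + footerTail + ')[^#]*)*)' + footer)

Matcher lazyMatcher = lazy.matcher(messages)
Matcher unrolledMatcher = unrolled.matcher(messages)
assert lazyMatcher.find() && unrolledMatcher.find()

String lazyBody = lazyMatcher.group(1)
String unrolledBody = unrolledMatcher.group(1)
assert lazyBody == unrolledBody   // both capture the three key=value lines
println unrolledBody
```

In the unrolled pattern, [^#]* and # can never match the same character, so the engine has only one way to consume any given text and there is nothing to backtrack into.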
Wiktor Stribiżew answered Oct 20 '25 06:10


