
Refactor this repetition that can lead to a stack overflow for large inputs Sonar

Tags: java, regex

I am trying to validate an email address with the following regex pattern:

 @Pattern(regexp="^$|[a-zA-Z0-9\\+\\.\\_\\%\\-\\+]{1,256}\\@[a-zA-Z0-9][a-zA-Z0-9\\-]{0,64}(\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})+", message = "Email address invalid")

This regex works as expected, but Sonar flags it as a critical bug:

Refactor this repetition that can lead to a stack overflow for large inputs.

Can you please help me shorten the regex while keeping the logic the same?

Sonam Gulwani asked Nov 05 '25 08:11


1 Answer

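I suggest you take a look at ReDoS attacks. The regular expression you have is susceptible because of the nested repetitions: the last group is repeated with + and itself contains a repetition ({0,25}). Java's regex engine backtracks over a group like that using recursive calls, so a sufficiently long input can overflow the stack, which is what Sonar is warning about.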

If you are trying to avoid the Sonar warning and don't care if the regular expression is VERY ugly, you could replace the last capturing group (\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})+ with something like:

(\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})(\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})?(\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})?(\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25})?

Repeat the "optional" bit as many times as is likely to be necessary. It's pretty rare for an email address to have more than 4-5 subdomains, but this isn't 100% foolproof. It also seems that you are only allowing an oddly specific "flavor" of email address (i.e. all the 0-25 bits). If you get rid of these requirements, you could replace that last group with something like:

[.a-zA-Z0-9\\-]*\\.[a-zA-Z0-9][a-zA-Z0-9\\-]*

(note the literal . in the first character class)
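Here is a minimal sketch for trying both suggestions with plain java.util.regex before putting either one into the annotation. The class name EmailPatternSketch, the unrolled helper and its maxLabels parameter are made up for illustration. The simplified variant here ends in [a-zA-Z0-9\\-]* so that final labels longer than two characters (such as .com) are accepted. I have not checked whether Sonar is satisfied with either form, so treat this as a way to compare behaviour only.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative only.
final class EmailPatternSketch {

    // Empty-string alternative, local part and first domain label,
    // copied unchanged from the pattern in the question.
    static final String PREFIX =
            "^$|[a-zA-Z0-9\\+\\.\\_\\%\\-\\+]{1,256}\\@[a-zA-Z0-9][a-zA-Z0-9\\-]{0,64}";

    // One ".label" segment, also copied from the question.
    static final String LABEL = "\\.[a-zA-Z0-9][a-zA-Z0-9\\-]{0,25}";

    // Option 1: replace (LABEL)+ with one mandatory label followed by
    // (maxLabels - 1) optional copies, so no group is repeated with + or *.
    static Pattern unrolled(int maxLabels) {
        StringBuilder sb = new StringBuilder(PREFIX);
        sb.append("(?:").append(LABEL).append(")");
        for (int i = 1; i < maxLabels; i++) {
            sb.append("(?:").append(LABEL).append(")?");
        }
        return Pattern.compile(sb.toString());
    }

    // Option 2: drop the per-label length limits and use flat character classes.
    static final Pattern SIMPLIFIED =
            Pattern.compile(PREFIX + "[.a-zA-Z0-9\\-]*\\.[a-zA-Z0-9][a-zA-Z0-9\\-]*");

    public static void main(String[] args) {
        Pattern unrolled = unrolled(4);
        String[] samples = {
            "", "user@example.com", "user@mail.example.co.uk",
            "user@localhost", "user@a.b.c.d.e.f", "user@example..com"
        };
        for (String s : samples) {
            // matches() requires the whole input to match, which is how
            // @Pattern applies the expression as well.
            System.out.printf("%-26s unrolled=%-5b simplified=%b%n",
                    "\"" + s + "\"",
                    unrolled.matcher(s).matches(),
                    SIMPLIFIED.matcher(s).matches());
        }
    }
}
```

With maxLabels set to 4, the unrolled pattern rejects user@a.b.c.d.e.f (five labels after the first) while the simplified one accepts it. The simplified pattern is also looser in another way: it accepts consecutive dots, as in user@example..com, which the original pattern did not.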

As @VGR suggests, the @Email validator is much better suited for email validation. Emails are very complicated to validate.

Jeff Brower answered Nov 06 '25 21:11


