
Performance issue while evaluating email address with a regular expression

I am using the regular expression below to validate email addresses.

/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/

JavaScript code:

function isValidEmail(email) {
    // Same pattern as above
    var pattern = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

    return pattern.test(email);
}

var email = '[email protected]';

isValidEmail(email);

The regex evaluates quickly when I provide the invalid email below:

aseflj#$kajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com

(I added #$ in the middle of the name)

However, when I try to evaluate this email, it takes too long and the browser hangs:

asefljkajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com1

(I added 1 at the end, so the address ends in com1)

I'm sure that the regex is correct, but I'm not sure why it's taking so much time to evaluate the second example. If I provide a shorter email, it evaluates quickly. See the example below:

[email protected]

Please help me fix the performance issue.

Pradeep K M asked Apr 10 '15 00:04



1 Answer

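Your regex runs into catastrophic backtracking. Since [\.-]? in ([\.-]?\w+)* is optional, the group can degenerate to (\w+)*, which is a classic case of catastrophic backtracking: a run of n word characters can be split between the inner \w+ and the outer * in exponentially many ways, and when the overall match fails (as it does with the trailing 1 in com1), the engine tries every split before giving up. The first example fails quickly because #$ appears after only six word characters, so there are few splits to try.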
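To see the blow-up on a small scale, here is a rough timing sketch. It uses the pattern from the question (with \w+ in both groups) against inputs that end in com1, so every attempt fails. The helper name timeFailingMatch and the chosen lengths are illustrative assumptions, and actual timings will vary by engine and machine.

```javascript
// Pattern from the question; the optional [\.-]? lets each group behave like (\w+)*.
const slowPattern = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

// Hypothetical helper: time one failing match for a local part of n letters.
function timeFailingMatch(n) {
    // The trailing "1" guarantees the overall match fails, forcing full backtracking.
    const input = 'a'.repeat(n) + '@company.com1';
    const start = Date.now();
    const matched = slowPattern.test(input);
    return { n: n, matched: matched, elapsedMs: Date.now() - start };
}

// Keep the lengths small: the work grows exponentially with n.
for (const n of [8, 12, 16]) {
    console.log(timeFailingMatch(n));
}
```

Each extra character in the local part roughly doubles the number of splits the engine has to try, which is why the long address in the question appears to hang while a short one returns immediately.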

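Removing the ? resolves the issue: each repetition of the group must then start with a . or -, so there is only one way to split a run of word characters.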

I also removed the redundant escape of . inside the character class and changed the regex a bit, replacing the repeated final group (\.\w{2,3})+ with a single \.\w{2,3}:

^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$
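A quick check of the revised pattern, as a sketch. The 100-character local part is a stand-in for the long address in the question, and the variable names and sample addresses are illustrative assumptions.

```javascript
// Revised pattern from the answer: each group repetition must start with . or -
const fixedPattern = /^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$/;

// Stand-in for the long local part in the question (assumed length: 100).
const longLocal = 'a'.repeat(100);

const samples = [
    longLocal + '@company.com',    // well-formed, long local part
    longLocal + '@company.com1',   // the shape that hung the original pattern
    'first.last@mail.example.org'  // dotted local part and a subdomain
];

for (const sample of samples) {
    const start = Date.now();
    const matched = fixedPattern.test(sample);
    // Print the result, the elapsed time, and the tail of the input.
    console.log(matched, (Date.now() - start) + ' ms', sample.slice(-20));
}
```

The second sample is still rejected, but the engine no longer has multiple ways to carve up the run of letters, so the failure should be reported without a noticeable delay.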

Do note that many new generic TLDs have more than 3 characters. Even some gTLDs from before the expansion have more than 3 characters, such as .info.

As it stands, the regex also doesn't support internationalized domain names.
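Both limitations can be checked directly. The sketch below also tries one possible relaxation, \w{2,} for the last label; that variant is an assumption of this example and not part of the answer, and it still leaves internationalized domains unsupported. The sample addresses are made up.

```javascript
// Pattern from the answer: the last label is limited to 2-3 word characters.
const answerPattern = /^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$/;

// Hypothetical variant: allow a last label of two or more word characters.
const longTldPattern = /^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,}$/;

// A 4-letter TLD is rejected by the answer's pattern but accepted by the variant.
console.log(answerPattern.test('user@example.info'));  // false
console.log(longTldPattern.test('user@example.info')); // true

// \w only covers ASCII letters, digits and underscore, so a domain
// containing "\u00fc" (u with umlaut) is rejected by both patterns.
console.log(longTldPattern.test('user@m\u00fcnchen.de')); // false
```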

nhahtdh answered Sep 28 '22 03:09