 

Syntax for multiple positive lookaheads in JavaScript regex

I'm trying to include two positive lookaheads in one regex. Here's the problem I'm working on as an example.

(?=[a-zA-Z])(?=[0-9])[a-zA-Z0-9]{0,20}

This is what I'm trying to match:

  • 0-20 characters
  • at least one letter anywhere
  • at least one digit anywhere
  • only letters and numbers allowed

When I do this with only one lookahead, it works, but as soon as I add the other, it breaks. What's the correct syntax for two lookaheads?

Qaz asked Sep 30 '14 20:09



2 Answers

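Lookaheads are zero-width: they check what follows the current position without consuming anything. Both of your lookaheads inspect only the very next character, and no single character is a letter and a digit at once, so the pattern can never satisfy both. Put a greedy .* (or lazy .*?) inside each lookahead so that each one can scan ahead for its own requirement.

As @AlexR mentioned in the comments, I modified the regex a little: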
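A minimal sketch of why the original pattern fails, assuming a sample input `abc1` (the input and the variable names are mine, not from the question). Both lookaheads are evaluated at the same position, so adding the second one makes the pattern unsatisfiable until each lookahead is allowed to scan ahead:

```javascript
// One lookahead: the next character must be a letter.
const oneLookahead = /(?=[a-zA-Z])[a-zA-Z0-9]{0,20}/;
// Two lookaheads at the same position: the next character would have to be
// a letter AND a digit, which is impossible.
const twoLookaheads = /(?=[a-zA-Z])(?=[0-9])[a-zA-Z0-9]{0,20}/;
// With .* inside, each lookahead may find its character anywhere ahead.
const scanning = /(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{0,20}/;

const sample = "abc1"; // hypothetical input for illustration

const results = {
  oneLookahead: oneLookahead.test(sample),
  twoLookaheads: twoLookaheads.test(sample),
  scanning: scanning.test(sample),
};
console.log(results);
```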

^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9_]{0,20}$

By the way, your character class did not match underscores, so I added one.

The above is almost equivalent to:

^(?=[^a-zA-Z]*[a-zA-Z])(?=\D*\d)\w{1,20}$
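As a rough check, the two patterns can be run side by side. This is only a sketch: the sample inputs and the names `withDotStar` and `withNegated` are assumptions of mine, and agreement on a handful of strings is not a proof of equivalence.

```javascript
// Anchored pattern with .* in each lookahead.
const withDotStar = /^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9_]{0,20}$/;
// Variant with negated classes, which skips straight to the first letter/digit.
const withNegated = /^(?=[^a-zA-Z]*[a-zA-Z])(?=\D*\d)\w{1,20}$/;

// Hypothetical sample inputs: valid, letters only, digits only,
// with underscore, with a disallowed character, empty, and 21 characters.
const samples = ["abc123", "abc", "123", "a_1", "a-1", "", "01234567890123456789A"];

const table = samples.map((s) => ({
  input: s,
  withDotStar: withDotStar.test(s),
  withNegated: withNegated.test(s),
}));
console.table(table);
```

On these samples only "abc123" and "a_1" are accepted, and both patterns give the same answer for every input.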
revo answered Sep 23 '22 02:09


A problem with @revo's answer occurs when the input is too long: without anchors, 01234567890123456789A (21 characters) passes both lookaheads and the final check, because the character class simply matches the first 20 characters. A fixed version either anchors the pattern with ^ and $ or uses length-bounded lookaheads (or both):

^(?=.{0,19}[a-zA-Z])(?=.{0,19}[0-9])[a-zA-Z0-9]{0,20}$ // (1), (1*) without ^
^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{0,20}$
(?=.{0,19}[a-zA-Z])(?=.{0,19}[0-9])[a-zA-Z0-9]{0,20} // (2)

Only the last variant, (2), allows text around the matched string. Omitting the ^ from (1), which gives (1*), allows the password to be preceded by other text, for example:

Input            : "Password1 = ASDF0123"
Matches with (1) : none
Matches with (1*): "ASDF0123"
Matches with (2) : "Password1", "ASDF0123"
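The table above can be reproduced with a short sketch. The helper `nonEmptyMatches` and the object layout are my own naming. Note one assumption: because `{0,20}` can also match the empty string, variant (2) with the `g` flag returns empty matches at the positions between the two words, so the helper filters those out.

```javascript
const input = "Password1 = ASDF0123";

// The three variants discussed above, with the g flag to collect all matches.
const variants = {
  "(1)": /^(?=.{0,19}[a-zA-Z])(?=.{0,19}[0-9])[a-zA-Z0-9]{0,20}$/g,
  "(1*)": /(?=.{0,19}[a-zA-Z])(?=.{0,19}[0-9])[a-zA-Z0-9]{0,20}$/g,
  "(2)": /(?=.{0,19}[a-zA-Z])(?=.{0,19}[0-9])[a-zA-Z0-9]{0,20}/g,
};

// Hypothetical helper: all matches of a global regex, minus empty strings.
function nonEmptyMatches(regex, text) {
  return (text.match(regex) || []).filter((m) => m.length > 0);
}

const found = {};
for (const [label, regex] of Object.entries(variants)) {
  found[label] = nonEmptyMatches(regex, input);
}
console.log(found);

// The over-long input: an unanchored .* pattern accepts it, an anchored one does not.
const tooLong = "01234567890123456789A";
const unanchored = /(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{0,20}/;
const anchored = /^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{0,20}$/;
const tooLongResult = {
  unanchored: unanchored.test(tooLong),
  anchored: anchored.test(tooLong),
};
console.log(tooLongResult);
```

If empty matches are unwanted in the first place, changing `{0,20}` to `{1,20}` would be one way to avoid the filter.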
AlexR answered Sep 23 '22 02:09