
How to use regex lookahead to limit the total length of input string

I have this regular expression and want to add a rule that limits the total length to no more than 15 characters. I've seen some lookahead examples, but they're not quite clear to me. Can you help me modify this expression to support the new rule?

^([A-Z]+( )*[A-Z]+)+$
AustinTX asked Nov 23 '11 at 17:11




3 Answers

^(?=.{1,15}$)([A-Z]+( )*[A-Z]+)+$
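As a rough illustration of how the lookahead caps the length, here is a minimal sketch using Python's `re` module (the choice of Python and the helper name `is_valid` are assumptions, not part of the question). It compares a bounded quantifier, `.{1,15}`, which allows up to 15 characters, with `.{15}`, which only accepts strings of exactly 15 characters.

```python
import re

# Positive lookahead: starting at the beginning of the string, between 1 and
# 15 characters must be followed directly by the end of the string.
CAPPED = re.compile(r"^(?=.{1,15}$)([A-Z]+( )*[A-Z]+)+$")

# With .{15} the lookahead only succeeds on strings of exactly 15 characters.
EXACT = re.compile(r"^(?=.{15}$)([A-Z]+( )*[A-Z]+)+$")


def is_valid(pattern, text):
    """Hypothetical helper: True if the pattern matches the whole text."""
    return pattern.match(text) is not None


for sample in ["AB", "HELLO WORLD", "ABCDEFGHIJKLMNO", "ABCDEFGHIJKLMNOP"]:
    print(len(sample), is_valid(CAPPED, sample), is_valid(EXACT, sample))
```

The lookahead consumes no characters, so the original pattern still has to match the whole string afterwards; the lookahead only adds the length condition on top.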

codaddict answered Sep 30 '22 at 14:09



Since you mentioned it in the title, a negative lookahead for your case would be:

^(?!.{16,})(regex goes here)+$

Note the negative lookahead at the beginning, (?!.{16,}), which checks that the string does not have 16 or more characters.

However, as @TimPietzcker has pointed out, your regex can be simplified a lot and rewritten in a form that is not prone to backtracking, so you should use his solution.
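To make the placeholder concrete, here is a small sketch in Python. The inner pattern `[A-Z ]` is only a stand-in for "(regex goes here)", and the helper name `fits` is made up for the example.

```python
import re

# "[A-Z ]" is an assumed stand-in for the "(regex goes here)" placeholder.
# The negative lookahead fails the match as soon as 16 or more characters
# can be found from the start of the string.
NOT_TOO_LONG = re.compile(r"^(?!.{16,})([A-Z ])+$")


def fits(text):
    """Hypothetical helper: True if text matches and is at most 15 chars."""
    return NOT_TOO_LONG.match(text) is not None


for sample in ["HELLO WORLD", "A" * 15, "A" * 16, ""]:
    print(repr(sample), fits(sample))
```

Because the lookahead runs before the rest of the pattern, an over-long string is rejected without the main group being tried at all.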

Rich O'Kelly answered Sep 30 '22 at 15:09



Actually, all this can be simplified a lot:

^[A-Z][A-Z ]{0,13}[A-Z]$

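does exactly what you want. Or at least nearly what your current regex does (plus the length restriction): the only difference is that your regex rejects a one-letter word in the middle of the input, such as A B C, while this one accepts it. This especially avoids problems with catastrophic backtracking, which you're setting yourself up for when nesting quantifiers like that.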

Case in point:

Try the string ABCDEFGHIJKLMNOP against your original regex. The regex engine will match that instantly. Now try the string ABCDEFGHIJKLMNOPa. It will take the regex engine nearly 230,000 steps to figure out it can't match the string. And each additional character doubles the number of steps needed to determine a failed match.
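The effect can be observed with a short timing sketch in Python. This is only an illustration under assumptions: the step count quoted above comes from a different regex engine, so Python's `re` module will not report the same numbers, and the absolute timings depend on the machine. The helper names `timed_match` and `failing_subject` are invented for the example.

```python
import re
import time

ORIGINAL = re.compile(r"^([A-Z]+( )*[A-Z]+)+$")
SIMPLIFIED = re.compile(r"^[A-Z][A-Z ]{0,13}[A-Z]$")


def timed_match(pattern, subject):
    """Return (matched, elapsed_seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(subject) is not None
    return matched, time.perf_counter() - start


def failing_subject(n):
    """n uppercase letters plus a trailing lowercase 'a' that forces a failure."""
    return "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:n] + "a"


# The time the original pattern needs to reject the input grows quickly with
# the number of letters, while the simplified pattern stays flat.
for n in range(10, 19, 2):
    subject = failing_subject(n)
    _, slow = timed_match(ORIGINAL, subject)
    _, fast = timed_match(SIMPLIFIED, subject)
    print(f"{n + 1:2d} chars  original: {slow:.6f}s  simplified: {fast:.6f}s")
```

The simplified pattern has no nested quantifiers, so a failed match is detected after a single pass over at most 15 characters.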

Tim Pietzcker answered Sep 30 '22 at 15:09
