Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the differences between lazy, greedy and possessive quantifiers?

How do the following quantifiers differ - with respect of scenarios, speed, etc.

  • ?, ?? and ?+ all match 0 or 1 times.
  • *, *? and*+` all match 0 or more times.
  • +, +? and ++ all match 1 or more times.

  • ?, * and + are greedy.
  • ??, *? and +? are reluctant/lazy.
  • ?+, *+ and ++ are possessive.

Can anyone help me to understand what these terms mean? Why are there three variations of each quantifier for the same job?

like image 580
DoLoveSky Avatar asked Jan 15 '13 20:01

DoLoveSky


People also ask

What are lazy quantifiers?

The lazy mode of quantifiers is an opposite to the greedy mode. It means: “repeat minimal number of times”. We can enable it by putting a question mark '?' after the quantifier, so that it becomes *? or +? or even ?? for '?' .

What is a possessive quantifier?

Like a greedy quantifier, a possessive quantifier repeats the token as many times as possible. Unlike a greedy quantifier, it does not give up matches as the engine backtracks. With a possessive quantifier, the deal is all or nothing.

What are greedy quantifiers?

Greedy Quantifier (Default) Greedy quantifiers try to match the longest text that matches a given pattern. Greedy quantifiers work by first reading the entire string before trying any match. If the whole text doesn't match, remove the last character and try again, repeating the process until a match is found.

What is pattern greedy vs non greedy matching?

So the difference between the greedy and the non-greedy match is the following: The greedy match will try to match as many repetitions of the quantified pattern as possible. The non-greedy match will try to match as few repetitions of the quantified pattern as possible.


1 Answers

Take the string

aaaab

and see how the following regexes match it:

Regex          Submatches
               group 1  group 2  group3
(a?)(a*)(ab)   a        aa       ab
(a??)(a*)(ab)           aaa      ab
(a?+)(a*)(ab)  a        aa       ab
(a*)(a?)(ab)   aaa               ab
(a*?)(a?)(ab)  aa       a        ab
(a*+)(a?)(ab)  aaaa              <Match fails!>
(a+)(a*)(ab)   aaa               ab 
(a+?)(a*)(ab)  a        aa       ab
(a++)(a*)(ab)  aaaa              <Match fails!>

Explanation:

  • a? tries to match one a, but it's prepared to match nothing if that's necessary for the whole match to succeed.
  • a?? tries to match nothing, but it's prepared to match one a if that's necessary for the whole match to succeed.
  • a?+ tries to match one a. If it can do that, it will not back down to match nothing if that were necessary for the overall match to succeed. If it can't match an a, then it will gladly match nothing, though.
  • a* tries to match as many as as it can, but it's prepared to match fewer as, even nothing if that's necessary for the whole match to succeed.
  • a*? tries to match nothing, but it's prepared to match just as many as as is absolutely necessary in order for the whole match to succeed, but not more.
  • a*+ tries to match as many as as it can. If it can do that, it will not back down to match fewer as if that were necessary for the overall match to succeed. If it can't match even a single a, then it will gladly match nothing, though.
  • a+ tries to match as many as as it can, but it's prepared to match fewer as (but at least one) if that's necessary for the whole match to succeed.
  • a+? tries to match only one a, but it's prepared to match just as many as as is absolutely necessary in order for the whole match to succeed, but not more.
  • a++ tries to match as many as as it can. If it can do that, it will not back down to match fewer as if that were necessary for the overall match to succeed. If it can't match even a single a, then the regex fails.
like image 122
Tim Pietzcker Avatar answered Nov 15 '22 04:11

Tim Pietzcker