
Regular expression combination of quantifiers *?

Tags: regex

What does this combination of quantifiers *? mean?

Take the following as an example:

([0-9][AB]*?)
akaii asked Mar 12 '23 15:03



2 Answers

It's a non-greedy match. In [AB]*?, the regex looks for as few occurrences of [AB] as needed to make the overall regex match the searched string, whereas the greedy version [AB]* looks for as many occurrences as possible. It is a feature of Perl's regexes, and hence available in PCRE (Perl Compatible Regular Expressions) (see repetition) and other systems that look to Perl for their definition.

The PCRE page gives an example:

The classic example of where [greediness] gives problems is in trying to match comments in C programs. These appear between /* and */ and within the comment, individual * and / characters may appear. An attempt to match C comments by applying the pattern:

/\*.*\*/

to the string

/* first comment */  not comment  /* second comment */

fails, because it matches the entire string owing to the greediness of the .* item.

If a quantifier is followed by a question mark, it ceases to be greedy, and instead matches the minimum number of times possible, so the pattern

/\*.*?\*/

does the right thing with the C comments.
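
The same comparison can be reproduced outside PCRE. The following is a minimal sketch using Python's `re` module, which also supports the `*?` form; the variable names are only illustrative, and the sample string is the one from the quoted example.

```python
import re

# Sample string from the PCRE documentation example quoted above.
text = "/* first comment */  not comment  /* second comment */"

# Greedy: .* runs to the end of the line, then backtracks to the LAST "*/".
greedy = re.findall(r"/\*.*\*/", text)

# Lazy: .*? grows one character at a time and stops at the FIRST "*/".
lazy = re.findall(r"/\*.*?\*/", text)

print(greedy)  # one match spanning the whole string
print(lazy)    # two matches, one per comment
```

Note that `.` does not match a newline by default in Python, so comments spanning several lines would additionally need the `re.DOTALL` flag.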

Jonathan Leffler answered Mar 24 '23 14:03



Jonathan already explained the difference, but here's an example that might help you understand what's happening here.

Given the string "9AB":

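  • ([0-9][AB]*?) matches only "9": the lazy [AB]*? takes as few characters as it can, and since nothing after it in the pattern forces it to take more, it takes none (lazy)

  • ([0-9][AB]*) matches the whole string ("9AB") because [AB]* consumes "A" and then goes on to consume the following "B" as well (greedy)

Note that the second one matches a digit followed by zero or more "A"s or "B"s, with no upper limit. The lazy version accepts the same characters but only takes more of them when the rest of the pattern requires it.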
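
A quick way to check this is to run the patterns, for instance with Python's `re` module (a minimal sketch; the third pattern, with a trailing `B` after the group, is an addition of mine to show when the lazy quantifier is forced to take more):

```python
import re

s = "9AB"

# Lazy, with nothing after the group: [AB]*? is satisfied with zero characters.
lazy = re.search(r"([0-9][AB]*?)", s).group(1)

# Greedy: [AB]* takes every "A" or "B" it can reach.
greedy = re.search(r"([0-9][AB]*)", s).group(1)

# Lazy, but a literal B must follow the group: the quantifier expands
# just far enough ("A") for that trailing B to match.
lazy_then_b = re.search(r"([0-9][AB]*?)B", s).group(1)

print(lazy)         # 9
print(greedy)       # 9AB
print(lazy_then_b)  # 9A
```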

Maroun answered Mar 24 '23 14:03
