
The influence of ? in the regex string

Tags: python, regex

Consider the following Python code:

>>> import re
>>> re.search(r'.*(99)', 'aa99bb').groups()
('99',)
>>> re.search(r'.*(99)?', 'aa99bb').groups()
(None,)

I don't understand why the group doesn't capture 99 in the second example.

Asked by Sylvain, Dec 22 '22 16:12


1 Answer

This is because the greedy .* first consumes the entire string. At that point there is nothing left for 99 to match, but since the group is optional, the regex engine lets it match nothing and stops: it has already found a successful overall match, so it never backtracks.

If, on the other hand, the group is mandatory, the regex engine has to backtrack into the .*, giving back one character at a time until 99 can match.
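One way to check this without a debugger is to look at what each part of the pattern actually matched. This is a minimal sketch using only the standard re module; the variable names are just for illustration.

```python
import re

text = 'aa99bb'

# Mandatory group: .* has to give characters back until 99 can match.
m1 = re.search(r'.*(99)', text)
print(m1.group(0), m1.span(0), m1.span(1))  # aa99 (0, 4) (2, 4)

# Optional group: .* keeps the whole string and the group matches nothing.
m2 = re.search(r'.*(99)?', text)
print(m2.group(0), m2.span(0), m2.span(1))  # aa99bb (0, 6) (-1, -1)
```

A span of (-1, -1) is how re reports a group that did not take part in the match, which is why groups() shows None for it.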

Compare the following debug sessions from RegexBuddy (the part of the string matched by .* is highlighted in yellow, the part matched by (99) in blue):

.*(99):

[image: RegexBuddy debug session for .*(99)]


.*(99)?:

[image: RegexBuddy debug session for .*(99)?]
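The same comparison can be approximated in plain Python. The helper below, trace_greedy, is a hypothetical name and a deliberately simplified model: it only imitates how a backtracking engine shortens .* step by step for these two patterns, and is not how the re module is implemented internally.

```python
import re

def trace_greedy(text, group_optional):
    """Simplified model of matching .*(99) or .*(99)? from position 0.

    Returns a list of (text held by .*, outcome for the group) steps.
    """
    tail = re.compile(r'99')
    steps = []
    # .* first takes everything, then gives back one character at a time.
    for end in range(len(text), -1, -1):
        if tail.match(text, end):
            # 99 fits right after what .* currently holds: overall success.
            steps.append((text[:end], '99'))
            return steps
        if group_optional:
            # The group may match nothing, so the engine succeeds here
            # and never needs to backtrack.
            steps.append((text[:end], None))
            return steps
        # Mandatory group failed at this position: backtrack into .*
        steps.append((text[:end], 'fail'))
    return steps

for step in trace_greedy('aa99bb', group_optional=False):
    print(step)
for step in trace_greedy('aa99bb', group_optional=True):
    print(step)
```

With the mandatory group the trace shows four failed attempts before .* has shrunk to 'aa' and 99 matches; with the optional group there is a single step, in which .* holds 'aa99bb' and the group is None.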

Answered by Tim Pietzcker, Jan 04 '23 18:01