
How do capture groups work? (wrt python regular expressions)

Tags:

python

regex

While using regex to help solve a problem in the Python Challenge, I came across some behaviour that confused me.

From the Python `re` documentation:

(...) Matches whatever regular expression is inside the parentheses.

and

'+' Causes the resulting RE to match 1 or more repetitions of the preceding RE.

So this makes sense:

>>> import re
>>> re.findall(r"(\d+)", "1111112")
['1111112']

But this doesn't:

>>> re.findall(r"(\d)+", "1111112")
['2']

I realise that findall returns only the groups when groups are present in the regex, but why is only the '2' returned? What happens to all the 1's in the match?

Ej. asked Apr 21 '26 18:04


1 Answer

Because you only have one capturing group, but it's "run" repeatedly, each new match is entered into the same "storage space" for that group. In other words, the 1s were lost when they were "overwritten" by subsequent 1s and eventually by the 2. The whole match is still "1111112"; findall just reports the group's final contents, which is the last digit.
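
To see the "overwriting" for yourself, here is a small sketch using only the standard `re` module. The variable names are just for illustration; the input string is the one from the question. It compares the whole match with the group's final contents, then shows a few ways to keep all the digits.

```python
import re

text = "1111112"

# Quantifier outside the group: the group matches one digit per repetition,
# and each repetition replaces the previous capture.
m = re.match(r"(\d)+", text)
whole = m.group(0)   # the entire match
last = m.group(1)    # only the final repetition's capture
print(whole)         # 1111112
print(last)          # 2

# Quantifier inside the group: the group captures the whole run at once.
run = re.findall(r"(\d+)", text)
print(run)           # ['1111112']

# Non-capturing group: with no capturing groups, findall returns whole matches.
non_capturing = re.findall(r"(?:\d)+", text)
print(non_capturing) # ['1111112']

# No repetition at all: one result per digit.
digits = re.findall(r"\d", text)
print(digits)        # ['1', '1', '1', '1', '1', '1', '2']
```

So if you want the full run of digits, put the `+` inside the parentheses or make the group non-capturing with `(?:...)`.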

Ben Blank answered Apr 24 '26 09:04


