Optional grouping in a simple python regex

All I want to do is search a string for instances of two consecutive digits. If such an instance is found I want to group it; otherwise I want None returned for that particular group. I thought this would be trivial, but I can't understand where I'm going wrong. In the example below, removing the optional quantifier (?) gets me the numbers, but for strings without numbers r evaluates to None, so r.groups() throws an exception.

import re

p = re.compile(r'(\d{2})?')
r = p.search('wqddsel78ffgr')
print(r.groups())
>>> (None,)    # why not ('78',)?

# --- update/clarification --- #

Thanks for the answers, but the explanations given are leaving me none the wiser. Here's another go at pinpointing exactly what it is I don't understand.

pattern = re.compile(r'z.*(A)?')
_string = "aazaa90aabcdefA"
result = pattern.search(_string)
result.group()
>>> zaa90aabcdefA
result.groups()
>>> (None, )

I understand why result.group() produces the result it does, but why doesn't result.groups() produce ('A', )? I thought it worked like this: once the regex hits the z, it matches right to the end of the line using .*. Even though .* matches everything, the regex engine is aware that it passed over an optional group, and since ? means it will try to match if it can, it should work backwards to try to match. Replacing ? with + does return ('A', ). This suggests that ? won't try to match if it doesn't have to, but this seems to contradict much of what I've read on the subject (esp. J. Friedl's excellent book).

Paul Patterson asked May 21 '26 09:05


1 Answer

This works for me:

import re

# Raw string so the backslashes reach the regex engine untouched
p = re.compile(r'\D*(\d{2})?')
r = p.search('wqddsel78ffgr')
print(r.groups())  # ('78',)

r = p.search('wqddselffgr')
print(r.groups())  # (None,)
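
As for why the original patterns return None, here is a rough sketch of what seems to be going on, assuming Python 3's re module. An optional group is allowed to match nothing, and the engine accepts the first overall success it finds, so it never goes back to look for a "better" match. The lazy `.*?` with a `$` anchor in the last case is just one illustrative alternative of mine, not the only fix; the variable names m1 to m4 are made up for the example.

```python
import re

# The whole pattern is optional, so search() succeeds immediately with an
# empty match at index 0 and never scans forward to the digits.
m1 = re.search(r'(\d{2})?', 'wqddsel78ffgr')
print(m1.span(), m1.groups())

# Consuming the leading non-digits first puts the group right at the digits.
m2 = re.search(r'\D*(\d{2})?', 'wqddsel78ffgr')
print(m2.span(), m2.groups())

# Greedy .* runs to the end of the string; (A)? then succeeds by matching
# nothing, so the overall match is complete and no backtracking happens.
m3 = re.search(r'z.*(A)?', 'aazaa90aabcdefA')
print(m3.group(), m3.groups())

# Illustrative alternative: a lazy .*? plus a $ anchor forces the engine to
# keep extending until the group can take the final 'A'.
m4 = re.search(r'z.*?(A)?$', 'aazaa90aabcdefA')
print(m4.group(), m4.groups())
```

With + instead of ?, the group must match at least once, so the engine is forced to backtrack out of .* to find the A. With ?, matching nothing already counts as success.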
Eric answered May 23 '26 23:05