All I want to do is search a string for instances of two consecutive digits. If such an instance is found I want to group it, otherwise return none for that particular groups. I thought this would be trivial, but I can't understand where I'm going wrong. In the example below, removing the optional (?) character gets me the numbers, but in strings without numbers, the r evaluates to None, so r.groups() throws an exception.
p = re.compile(r'(\d{2})?')
r = p.search('wqddsel78ffgr')
print r.groups()
>>>(None, ) # why not ('78', )?
# --- update/clarification --- #
Thanks for the answers, but the explanations given are leaving me none-the-wiser. Here's a another go at pin-pointing exactly what it is I don't understand.
pattern = re.compile(r'z.*(A)?')
_string = "aazaa90aabcdefA"
result = pattern.search(_string)
result.group()
>>> zaa90aabcdefA
result.groups()
>>> (None, )
I understand why result.group() produces the result it does, but why doesn't result.groups() produce ('A', )? I thought it worked like this: once the regex hits the z it then matches right to the end of the line using .*. In spite of .* matching everything, the regex engine is aware that it passed over an optional group, and since ? means it will try to match if it can, it should work backwards to try and match. Replacing ? with + does return ('A', ). This suggests that ? won't try and match if it doesn't have to, but this seems to contrast with much of what I've read on the subject (esp. J. Friedl's excellent book).
This works for me:
p = re.compile('\D*(\d{2})?')
r = p.search('wqddsel78ffgr')
print r.groups() # ('78',)
r = p.search('wqddselffgr')
print r.groups() # (None,)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With