My apologies for a complete newbie question. I did try searching Stack Overflow before posting this.
I am trying to learn regex with Python from diveintopython3.net. While fiddling with the examples, I failed to understand one particular output of a regex search (shown below):
>>> import re
>>> pattern = 'M?M?M?$'
>>> re.search(pattern,'MMMMmmmmm')
<_sre.SRE_Match object at 0x7f0aa8095168>
Why does the above regex pattern match the input text? My understanding is that the $ character should match only at the end of the string. But the input text ends with 'mmmm', so I thought the pattern should not match.
My python version is :
Python 3.3.2 (default, Dec 4 2014, 12:49:00)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
EDIT: Attached a screenshot from Debuggex.
Why does the above regex pattern match the input text?
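Because you made all the preceding M's optional. M? means an optional M: it may or may not be present. re.search also tries every position in the string, including the very end. At that position each M? matches nothing and $ succeeds, so the regex 'M?M?M?$' matches only the zero-width end-of-string boundary. Hence you got a match.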
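To see where the match actually lands, you can inspect the match object. This is a minimal sketch using the same pattern and input as the question; the variable names text and m are only for illustration.

```python
import re

pattern = 'M?M?M?$'
text = 'MMMMmmmmm'

m = re.search(pattern, text)

# The match is the empty string sitting at the very end of the input.
print(repr(m.group()))  # ''
print(m.span())         # (9, 9)
print(len(text))        # 9
```

The start and end of the span are equal, so no characters were consumed; the match is just the position at the end of the string.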
It is because all the M symbols are optional, and $ (the only required element in this regex) matches at the end of the string. Your regex is effectively a zero-width assertion: it consumes no characters, but it still produces a match.
Here is a visualization:
M?M?M?$
Debuggex Demo
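If the intent was for the optional M's to account for the whole string, one possible fix (an assumption about what the asker wants, not something stated in the question) is to anchor the start as well with ^. A short sketch:

```python
import re

text = 'MMMMmmmmm'

# With ^ the match must begin at position 0, so the empty
# match at the end of the string is no longer allowed.
anchored = re.search('^M?M?M?$', text)
print(anchored)  # None

# A string made of at most three M's still matches in full.
ok = re.search('^M?M?M?$', 'MMM')
print(ok.group())  # MMM
```

With both anchors in place, the pattern only matches strings consisting of zero to three M characters.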