I'm noticing some odd behavior in Python's regex library, and I'm not sure if I'm doing something wrong.
If I run a regex on a string using re.sub() with re.MULTILINE, it seems to replace only the first few occurrences. It replaces all occurrences if I turn off re.MULTILINE, use re.subn(..., count = 0, flags = re.MULTILINE), or compile the regex using re.compile(..., re.MULTILINE).
I am running Python 2.7 on Ubuntu 12.04.
I've posted a random example on:
Can someone confirm / deny this behavior on their machine?
EDIT: Realized I should go ahead and post this on the Python bug tracker. EDIT 2: Issue reported: http://bugs.python.org/msg168909
The re.MULTILINE search modifier forces the ^ symbol to match at the beginning of each line of text (and not just the first), and the $ symbol to match at the end of each line of text (and not just the last one). The re.MULTILINE search modifier takes no arguments.
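As a small illustration of the description above, the following sketch compares ^ and $ with and without the flag. The sample text and variable names are made up for the example.

```python
import re

# Hypothetical three-line sample text.
text = "alpha\nbeta\ngamma"

# Without the flag, ^ matches only at the very start of the string.
first_line_only = re.findall(r"^\w+", text)
# With the flag, ^ matches at the start of every line.
every_line_start = re.findall(r"^\w+", text, flags=re.MULTILINE)

# Without the flag, $ matches only at the very end of the string.
last_line_only = re.findall(r"\w+$", text)
# With the flag, $ matches at the end of every line.
every_line_end = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(first_line_only)   # ['alpha']
print(every_line_start)  # ['alpha', 'beta', 'gamma']
print(last_line_only)    # ['gamma']
print(every_line_end)    # ['alpha', 'beta', 'gamma']
```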
The re.sub() function is used to replace occurrences of a particular sub-string with another sub-string. Among its inputs is the sub-string to replace.
The sub() function belongs to the regular expressions (re) module in Python. It returns a string where all matching occurrences of the specified pattern are replaced by the replacement string.
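A minimal sketch of that behavior, using an invented sentence: by default every match is replaced, and the optional count keyword limits how many replacements are made.

```python
import re

# Hypothetical sample sentence.
sentence = "the cat sat on the mat"

# With the default count=0, every occurrence is replaced.
all_replaced = re.sub(r"the", "a", sentence)
# count=1 stops after the first replacement.
first_only = re.sub(r"the", "a", sentence, count=1)

print(all_replaced)  # a cat sat on a mat
print(first_only)    # a cat sat on the mat
```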
The 'r' at the start of the pattern string designates a Python "raw" string, which passes backslashes through unchanged. This is very handy for regular expressions (Java needs this feature badly!). I recommend that you always write pattern strings with the 'r', just as a habit.
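To see why the raw prefix matters, here is a short sketch with made-up sample strings. Without the prefix, "\b" is a single backspace character rather than the regex word-boundary escape, so the pattern silently matches nothing.

```python
import re

# A raw string and a doubled-backslash string spell the same pattern.
same_pattern = (r"\d+\.\d+" == "\\d+\\.\\d+")

# In a normal string "\b" is one backspace character;
# in a raw string it is a backslash followed by the letter b.
normal_len = len("\b")
raw_len = len(r"\b")

# Hypothetical sample text.
words = "cat concat cat"
with_raw = re.findall(r"\bcat\b", words)    # word-boundary match
without_raw = re.findall("\bcat\b", words)  # looks for backspace characters

print(same_pattern)  # True
print(with_raw)      # ['cat', 'cat']
print(without_raw)   # []
```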
Use
re.sub(pattern, replace, text, flags=re.MULTILINE)
instead of
re.sub(pattern, replace, text, re.MULTILINE)
which is equivalent to
re.sub(pattern, replace, text, count=re.MULTILINE)
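which is a bug in your code: the fourth positional parameter of re.sub() is count, not flags. Since re.MULTILINE has the integer value 8, at most 8 occurrences are replaced and the flag itself is never applied.
See the documentation for re.sub().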
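The difference can be reproduced with a short sketch. The sample text here is invented (it is not the example from the question), and warnings are silenced because newer Python 3 releases may warn about passing count positionally.

```python
import re
import warnings

# Hypothetical sample text: 12 lines, each containing "foo" once.
text = "\n".join("foo %d" % i for i in range(12))

# re.MULTILINE is an integer flag; its value is 8.
flag_value = int(re.MULTILINE)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # The fourth positional argument is count, so this means count=8.
    positional = re.sub(r"foo", "bar", text, re.MULTILINE)

# Passing the flag by keyword leaves count at its default of 0 (replace all).
keyword = re.sub(r"foo", "bar", text, flags=re.MULTILINE)

print(flag_value)               # 8
print(positional.count("bar"))  # 8
print(keyword.count("bar"))     # 12
```

This matches the symptom in the question: only the first few (here, eight) occurrences are replaced when the flag is passed positionally.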