
Bug in Python Regex? (re.sub with re.MULTILINE)

Tags: python, regex

I'm noticing some odd behavior in Python's Regex library, and I'm not sure if I'm doing something wrong.

If I run a regex on a string using re.sub() with re.MULTILINE, it seems to replace only the first few occurrences. It replaces all occurrences if I turn off re.MULTILINE, use re.subn(..., count = 0, flags = re.MULTILINE), or compile the regex using re.compile(..., re.MULTILINE).

I am running Python 2.7 on Ubuntu 12.04.

I've posted a random example on:

  • Pastebin.com - Output from terminal
  • codepad - Script, confirming behavior (except for re.subn(), which is different on 2.5)

Can someone confirm / deny this behavior on their machine?

EDIT: Realized I should go ahead and post this on the Python bug tracker.

EDIT 2: Issue reported: http://bugs.python.org/msg168909

asked Aug 22 '12 23:08 by eacousineau



1 Answer

Use

re.sub(pattern, replace, text, flags=re.MULTILINE) 

instead of

re.sub(pattern, replace, text, re.MULTILINE) 

which is equivalent to

re.sub(pattern, replace, text, count=re.MULTILINE)

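which is a bug in your code, not in Python: the fourth positional parameter of re.sub() is count, so the integer value of re.MULTILINE (8) caps the number of replacements and no flag is actually set.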

See re.sub()
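
A minimal sketch of the difference, assuming a made-up input of 12 lines (the `text` and `pattern` values are illustrative only, not the asker's original data). It passes the flag's integer value as `count` explicitly, which is what the positional call ends up doing:

```python
import re

# Hypothetical sample input: 12 lines, each containing one "x".
text = "\n".join(["x"] * 12)
pattern = r"x"

# re.MULTILINE is an integer flag whose value is 8.
multiline_value = int(re.MULTILINE)

# Effect of re.sub(pattern, replace, text, re.MULTILINE):
# the flag value lands in `count`, so at most 8 replacements
# are made and no flag is set.
as_count = re.sub(pattern, "y", text, count=multiline_value)

# Intended call: pass the flag by keyword.
as_flags = re.sub(pattern, "y", text, flags=re.MULTILINE)

print(as_count.count("y"))  # 8
print(as_flags.count("y"))  # 12
```

This is why only "the first few" occurrences were replaced: the limit is exactly the numeric value of the flag.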

answered Nov 19 '22 19:11 by jfs