
Python Regex MULTILINE option not working correctly?

I'm writing a simple version updater in Python, and the regex engine is giving me mighty troubles.

In particular, ^ and $ aren't matching correctly even with the re.MULTILINE option. The string matches without the ^ and $, but no joy otherwise.

I would appreciate your help if you can spot what I'm doing wrong.

Thanks

target.c

somethingsomethingsomething
    NOTICE_TYPE revision[] = "A_X1_01.20.00";
somethingsomethingsomething

versionUpdate.py

import re

fileName = "target.c"
newVersion = "01.20.01"
find = r'^(\s+NOTICE_TYPE revision\[\] = "A_X1_)\d\d+\.\d\d+\.\d\d+(";)$'
replace = "\\1" + newVersion + "\\2"

file = open(fileName, "r")
fileContent = file.read()
file.close()

find_regexp = re.compile(find, re.MULTILINE)
file = open(fileName, "w")
file.write( find_regexp.sub(replace, fileContent) )
file.close()

Update: Thank you John and Ethan for a valid point. However, the regexp still isn't matching if I keep $. It works again as soon as I remove $.

Calvin asked Feb 24 '23 06:02


1 Answer

Change your replace to:

replace = r'\g<1>' + newVersion + r'\2'

The problem is that your version of replace results in this:

replace = "\\101.20.01\\2"

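which confuses the sub call: the digits of the new version run straight on from \1, so the template is no longer read as group 1 followed by the literal text 01. From the documentation for the Python re module: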

\g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'.
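
To see the difference end to end, here is a minimal sketch that applies both templates to the sample text from the question. The names VERSION_LINE, update_version, sample, ambiguous and fixed are made up for illustration, and the sketch works on an in-memory string rather than on target.c itself.

```python
import re

# Pattern from the question, written as a raw string.
VERSION_LINE = re.compile(
    r'^(\s+NOTICE_TYPE revision\[\] = "A_X1_)\d\d+\.\d\d+\.\d\d+(";)$',
    re.MULTILINE,
)


def update_version(text, new_version):
    """Return text with the revision string replaced by new_version."""
    # \g<1> closes the group reference explicitly, so the digits of
    # new_version that follow are taken as literal text.
    template = r"\g<1>" + new_version + r"\g<2>"
    return VERSION_LINE.sub(template, text)


# Sample input copied from the question's target.c.
sample = (
    "somethingsomethingsomething\n"
    '    NOTICE_TYPE revision[] = "A_X1_01.20.00";\n'
    "somethingsomethingsomething\n"
)

# The template from the question, for comparison: the version digits
# run straight on from \1.
ambiguous = VERSION_LINE.sub("\\1" + "01.20.01" + "\\2", sample)

fixed = update_version(sample, "01.20.01")
print(fixed)
print(ambiguous == fixed)  # False: the ambiguous template does not give the intended text
```

Using \g<2> for the second group as well is not required here, since \2 is followed by nothing, but it keeps the template consistent if more text is ever appended.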

John Gaines Jr. answered Mar 02 '23 14:03