I would like to match an entire line in a multi-line string (this code is part of a unit test that checks the output format).
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match(r".*score = 0\.59.*", r"score = 0.65\nscore = 0.59\nscore = 1.0", re.MULTILINE)
<_sre.SRE_Match object; span=(0, 39), match='score = 0.65\\nscore = 0.59\\nscore = 1.0'>
This works fine: I can match anything within the multi-line string. However, I would like to make sure that I match an entire line. The documentation says that ^ and $ should match at the beginning and end of each line when re.MULTILINE is used. However, this somehow does not work for me:
>>> re.match(r".*^score = 0\.59$.*", r"score = 0.65\nscore = 0.59\nscore = 1.0", re.MULTILINE)
>>> 
Here are a few more experiments I made:
>>> import os
>>> re.match(r".*^score = 0\.59$.*", "score = 0.65{}score = 0.59{}score = 1.0".format(os.linesep, os.linesep), re.MULTILINE)
>>>
>>> re.match(r".*^score = 0\.65$.*", "score = 0.65{}score = 0.59{}score = 1.0".format(os.linesep, os.linesep), re.MULTILINE)
<_sre.SRE_Match object; span=(0, 12), match='score = 0.65'>
>>> re.match(r".*^score = 0\.65$.*", r"score = 0.65\nscore = 0.59\nscore = 1.0", re.MULTILINE)
>>> 
I guess I'm missing something rather simple, but I couldn't figure it out.
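The problem is that you're using a raw string for the input string, so \n is seen as a backslash followed by the letter n rather than a newline. The regex engine understands \n in the pattern, but it does not interpret escapes in the input string. Your input therefore contains no line breaks at all, and ^ and $ can only match at the very start and end of it.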
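A quick way to see the difference, as a minimal sketch (the variable names raw and normal are just for illustration and the strings are shortened to two lines):

```python
import re

raw = r"score = 0.65\nscore = 0.59"     # backslash + 'n', no line break
normal = "score = 0.65\nscore = 0.59"   # a real newline character

print(len(raw), len(normal))            # 26 25
print("\n" in raw, "\n" in normal)      # False True

# With MULTILINE, ^ can only match after real newlines.
raw_starts = re.findall(r"^score", raw, flags=re.MULTILINE)
normal_starts = re.findall(r"^score", normal, flags=re.MULTILINE)
print(len(raw_starts), len(normal_starts))  # 1 2
```

The raw string is one character longer because \n takes two characters there, and it only ever has a single line start.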
Also, even though it makes no difference here, always pass flags with the flags= keyword: some regex functions (re.sub, for example) take an extra count parameter before flags, and passing the flag positionally can lead to errors.
Like this:
>>> re.match(r".*^score = 0\.65$.*", "score = 0.65\nscore = 0.59\nscore = 1.0", flags=re.MULTILINE)
<_sre.SRE_Match object; span=(0, 12), match='score = 0.65'>
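To illustrate why the keyword matters, here is a small sketch using re.sub, whose signature is re.sub(pattern, repl, string, count=0, flags=0). The text value is made up for the demo, and the positional mistake is emulated with count= so the snippet does not depend on how a given Python version treats positional count arguments:

```python
import re

text = "a\na\na"

# re.sub(r"^a", "b", text, re.MULTILINE) would bind the flag to count,
# because count is the fourth positional parameter. re.MULTILINE is the integer 8:
print(int(re.MULTILINE))  # 8

# Emulating that mistake: the flag value ends up as count, and no flag is set.
wrong = re.sub(r"^a", "b", text, count=int(re.MULTILINE))
# Passing it by keyword does what was intended.
right = re.sub(r"^a", "b", text, flags=re.MULTILINE)

print(repr(wrong))  # 'b\na\na'
print(repr(right))  # 'b\nb\nb'
```

No exception is raised in the wrong case; the result is just silently different, which is what makes this easy to miss.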
And as I noted in the comments, .* needs re.DOTALL to match newlines:
>>> re.match(r".*^score = \d+\.\d+$.*", "score = 0.65\nscore = 0.59\nscore = 1.0", re.MULTILINE|re.DOTALL)
<_sre.SRE_Match object; span=(0, 37), match='score = 0.65\nscore = 0.59\nscore = 1.0'>
(As noted in "Python regex, matching pattern over multiple lines.. why isn't this working?" and "How do I match any character across multiple lines in a regular expression?", of which this could be a duplicate if it weren't for the raw string bit.)
(Sorry, my floating-point regex is probably a bit weak; you can find better ones around.)
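If the goal is only to check that one complete line is present, a possible alternative (not what the code above does, just a sketch) is re.search, which scans the whole string, so the surrounding .* and re.DOTALL are not needed. The variable names output, hit and miss are illustrative:

```python
import re

output = "score = 0.65\nscore = 0.59\nscore = 1.0"

# With MULTILINE, ^ and $ anchor at the start and end of each line.
hit = re.search(r"^score = 0\.59$", output, flags=re.MULTILINE)
print(hit.span())   # (13, 25)
print(hit.group())  # score = 0.59

# A pattern that covers only part of a line does not match.
miss = re.search(r"^score = 0\.5$", output, flags=re.MULTILINE)
print(miss)         # None
```

In a unit test, asserting that hit is not None would then confirm the exact line appears in the output.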