Regexes containing meaningful spaces break when re.VERBOSE is added, apparently because re.VERBOSE 'helpfully' magics away the (meaningful) whitespace inside 'Issue Summary', as well as all the crappy non-meaningful whitespace (e.g. padding and newlines inside a (multiline) pattern). (My use of re.VERBOSE with multiline is non-negotiable - this is actually a massive simplification of a huge multiline regex where re.VERBOSE is necessary just to stay sane.)
import re
re.match(r'''Issue Summary.*''', 'Issue Summary: fails''', re.U|re.VERBOSE)
# No match!
re.match(r'''Issue Summary.*''', 'Issue Summary: passes''', re.U)
<_sre.SRE_Match object at 0x10ba36030>
re.match(r'Issue Summary.*', 'Issue Summary: passes''', re.U)
<_sre.SRE_Match object at 0x10b98ff38>
Is there a saner alternative to write re.VERBOSE-friendly patterns containing meaningful spaces, short of replacing each instance in my pattern with '\s' or '.', which is not just ugly but counter-intuitive and a pain to automate?
re.match(r'Issue\sSummary.*''', 'Issue Summary: fails', re.VERBOSE)
<_sre.SRE_Match object at 0x10ba36030>
re.match(r'Issue.Summary.*''', 'Issue Summary: fails', re.VERBOSE)
<_sre.SRE_Match object at 0x10b98ff38>
(As an aside, this a useful docbug catch on Python 2 and 3. I'll file it once I get consensus here on what the right solution is)
re. VERBOSE : This flag allows you to write regular expressions that look nicer and are more readable by allowing you to visually separate logical sections of the pattern and add comments.
VERBOSE as the second argument to re. compile() allow you to do? The re. VERBOSE argument allows you to add whitespace and comments to the string passed to re.
Python has a module named re to work with RegEx. Here's an example: import re pattern = '^a...s$' test_string = 'abyss' result = re. match(pattern, test_string) if result: print("Search successful.") else: print("Search unsuccessful.")
If re.VERBOSE
is used, then I think there's no choice other than to change the regular expression string. However, I would suggest one of the following:
r'abc\ def'
or:
r'abc[ ]def'
Both r'\ '
and '[ ]'
match a single space character (not any whitespace, only an actual space). Note that, without the r
in front, the backslash character would need to be doubled, i.e. \\
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With