One of the biggest annoyances I find in Python is the inability of the re module to save its state without explicitly doing it in a match object. Often, one needs to parse lines and, if they comply with a certain regex, take values out of them using the same regex. I would like to write code like this:
if re.match(r'foo (\w+) bar (\d+)', line):
    # do stuff with .group(1) and .group(2)
elif re.match(r'baz whoo_(\d+)', line):
    # do stuff with .group(1)
# etc.
But unfortunately it's impossible to get to the matched object of the previous call to re.match, so this is written like this:
m = re.match(r'foo (\w+) bar (\d+)', line)
if m:
    # do stuff with m.group(1) and m.group(2)
else:
    m = re.match(r'baz whoo_(\d+)', line)
    if m:
        # do stuff with m.group(1)
Which is rather less convenient and gets really unwieldy as the list of elifs grows longer.
A hackish solution would be to wrap the re.match and re.search in my own objects that keep state somewhere. Has anyone used this? Are you aware of semi-standard implementations (in large frameworks or something)?
What other workarounds can you recommend? Or perhaps, am I just misusing the module and could achieve my needs in a cleaner way?
Thanks in advance
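For reference, the wrapper idea mentioned in the question could look roughly like the sketch below. The class name `Matcher`, its `last` attribute, and the function `parse` are hypothetical names made up for illustration; this is not a standard or semi-standard implementation, just one way to keep the last match around.

```python
import re


class Matcher:
    """Hypothetical helper that remembers the most recent match result."""

    def __init__(self):
        self.last = None  # last match object, or None if nothing matched

    def match(self, pattern, text):
        # Store the result so groups can be read after the `if` test.
        self.last = re.match(pattern, text)
        return self.last is not None

    def group(self, *args):
        # Delegate to the stored match object.
        return self.last.group(*args)


def parse(line):
    m = Matcher()
    if m.match(r'foo (\w+) bar (\d+)', line):
        return ('foo', m.group(1), int(m.group(2)))
    elif m.match(r'baz whoo_(\d+)', line):
        return ('baz', int(m.group(1)))
    return None
```

The if/elif chain stays flat, at the cost of carrying a small stateful object around.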
A regular expression (or RE) specifies a set of strings that matches it; the functions in this module let you check if a particular string matches a given regular expression (or if a given regular expression matches a particular string, which comes down to the same thing).
Regular Expressions, also known as “regex” or “regexp”, are used to match strings of text such as particular characters, words, or patterns of characters. It means that we can match and extract any string pattern from the text with the help of regular expressions.
Regular expressions are widely used in the UNIX world. The Python module re provides full support for Perl-like regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.
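As a small illustration of the re.error behaviour, the sketch below catches the exception when a pattern fails to compile. The helper name `try_compile` is made up for this example.

```python
import re


def try_compile(pattern):
    """Return a compiled pattern, or None if the pattern is invalid."""
    try:
        return re.compile(pattern)
    except re.error:
        # Raised for malformed patterns, e.g. an unbalanced parenthesis.
        return None
```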
Python 3.8 has now provided us with a neat solution: := (the walrus operator).
It will assign the right-hand value to the left-hand variable, then return the value.
Basically, we can finally fulfill @aaron's wish and simply write:
if m := re.match(r'foo (\w+) bar (\d+)', line):
    # do stuff with m.group(1) and m.group(2)
elif m := re.match(r'baz whoo_(\d+)', line):
    # do stuff with m.group(1)
elif ...
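A self-contained version of the same pattern, as a rough sketch: the function name `parse_line` and the returned tuples are assumptions added for illustration, and it needs Python 3.8 or later.

```python
import re


def parse_line(line):
    # Each branch binds m and tests it in a single expression.
    if m := re.match(r'foo (\w+) bar (\d+)', line):
        return ('foo', m.group(1), int(m.group(2)))
    elif m := re.match(r'baz whoo_(\d+)', line):
        return ('baz', int(m.group(1)))
    # No pattern matched.
    return None
```

Note that m is rebound by every branch that gets evaluated, so after the chain it holds the result of the last re.match call that ran.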