I'm pretty experienced with Perl and Ruby but new to Python, so I'm hoping someone can show me the Pythonic way to accomplish the following task. I want to compare several lines against multiple regular expressions and retrieve the matching group. In Ruby it would be something like this:
# Revised to show variance in regex and related action.
data, foo, bar = [], nil, nil
input_lines.each do |line|
  if line =~ /Foo(\d+)/
    foo = $1.to_i
  elsif line =~ /Bar=(.*)$/
    bar = $1
  elsif bar
    data.push(line.to_f)
  end
end
My attempts in Python are turning out pretty ugly, because the matching group is returned from a call to match/search on a regular expression, and Python has no assignment in conditionals or switch statements. What's the Pythonic way to approach (or think about!) this problem?
Something like this, but prettier:
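import re

regexes = [re.compile('...'), ...]
# Try each pattern in turn and stop at the first one that matches.
for regex in regexes:
    m = regex.match(s)
    if m:
        print(m.groups())
        break
else:
    # The for loop's else clause runs only if the loop ended without a break.
    print('No match')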
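As a rough sketch of how the loop above might be applied to the Ruby example from the question, each pattern can be paired with a small handler function. The names `parse`, `state`, `rules`, `set_foo` and `set_bar` are made up for illustration, and the code assumes Python 3.

```python
import re

def parse(input_lines):
    """Collect foo, bar and data roughly the way the Ruby snippet does."""
    state = {"foo": None, "bar": None, "data": []}

    def set_foo(m):
        state["foo"] = int(m.group(1))

    def set_bar(m):
        state["bar"] = m.group(1)

    # (compiled pattern, handler) pairs, tried in order like the if/elsif chain.
    rules = [
        (re.compile(r"Foo(\d+)"), set_foo),
        (re.compile(r"Bar=(.*)$"), set_bar),
    ]

    for line in input_lines:
        for regex, handler in rules:
            m = regex.search(line)  # search, because Ruby's =~ is unanchored
            if m:
                handler(m)
                break
        else:
            # No pattern matched: mirror the final `elsif bar` branch.
            # Ruby treats every string as true, hence the explicit None check.
            if state["bar"] is not None:
                state["data"].append(float(line))
    return state["foo"], state["bar"], state["data"]
```

One difference to keep in mind: `float(line)` raises `ValueError` on text that is not a number, whereas Ruby's `to_f` returns 0.0, so real input may need a try/except around that call.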
There are several ways to "bind a name on the fly" in Python, such as my old recipe for "assign and test"; in this case I'd probably choose another such way (assuming Python 2.6; it needs some slight changes if you're working with an older version of Python), something like:
import re

pats_mark = (r'^A:(.*)$', 'FOO'), (r'^B:(.*)$', 'BAR')
for line in lines:
    # Bind the first successful match object and its marker, or (None, None).
    mo, m = next(((mo, m) for p, m in pats_mark for mo in [re.match(p, line)] if mo),
                 (None, None))
    if mo:
        print('%s: %s' % (m, mo.group(1)))
    else:
        print('NO MATCH: %s' % line)
Many minor details can be adjusted, of course (for example, I just chose (.*) rather than (.*?) as the matching group -- they're equivalent given the immediately-following $ so I chose the shorter form;-) -- you could precompile the REs, factor things out differently than the pats_mark tuple (e.g., with a dict indexed by RE patterns), etc.
But the substantial ideas, I think, are to make the structure data-driven, and to bind the match object to a name on the fly with the subexpression for mo in [re.match(p, line)], a "loop" over a single-item list (genexps bind names only by loop, not by assignment -- some consider using this part of genexps' specs to be "tricky", but I consider it a perfectly acceptable Python idiom, esp. since it was considered back in the time when listcomps, genexps' "ancestors" in a sense, were being designed).
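A minimal sketch of the same idea for Python 3, with the REs precompiled as suggested above. The names `PATS_MARK`, `first_match` and `describe`, and the sample lines, are made up for illustration; the patterns and markers are the ones from the answer.

```python
import re

# (compiled pattern, marker) pairs: the data that drives the matching.
PATS_MARK = (
    (re.compile(r'^A:(.*)$'), 'FOO'),
    (re.compile(r'^B:(.*)$'), 'BAR'),
)

def first_match(line, pats_mark=PATS_MARK):
    """Return (match object, marker) for the first matching pattern,
    or (None, None) if no pattern matches."""
    return next(
        # `for mo in [p.match(line)]` binds mo via a one-item "loop".
        ((mo, m) for p, m in pats_mark for mo in [p.match(line)] if mo),
        (None, None),
    )

def describe(line):
    mo, m = first_match(line)
    if mo:
        return '%s: %s' % (m, mo.group(1))
    return 'NO MATCH: %s' % line

for line in ['A:hello', 'B:world', 'C:nope']:
    print(describe(line))
```

On Python 3.8 and later, an assignment expression such as `if (mo := p.match(line)):` is another way to bind the match object inside a condition, which may read more naturally than the single-item list.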