Python comparing string against several regular expressions

Tags: python, regex, switch-statement

I'm pretty experienced with Perl and Ruby but new to Python so I'm hoping someone can show me the Pythonic way to accomplish the following task. I want to compare several lines against multiple regular expressions and retrieve the matching group. In Ruby it would be something like this:

# Revised to show variance in regex and related action.
data, foo, bar = [], nil, nil
input_lines.each do |line|
  if line =~ /Foo(\d+)/
    foo = $1.to_i
  elsif line =~ /Bar=(.*)$/
    bar = $1
  elsif bar
    data.push(line.to_f)
  end
end

My attempts in Python are turning out pretty ugly because the matching group is returned from a call to match/search on a regular expression, and Python has neither assignment in conditionals nor switch statements. What's the Pythonic way to approach (or think about!) this problem?

449

asked Apr 13 '10 22:04

maerics

2 Answers

Something like this, but prettier:

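import re

regexes = [re.compile('...'), ...]  # your own compiled patterns go here

for regex in regexes:
  m = regex.match(s)  # s is the string being tested
  if m:
    print(m.groups())
    break
else:
  # for/else: this branch runs only if the loop never hit break
  print('No match')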
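For something runnable, here is a minimal sketch of the same first-match loop wrapped in a function. The patterns are borrowed from the question's Ruby snippet; the names REGEXES and first_groups are made up for illustration, and it uses search rather than match on the assumption that you want Ruby's =~ behaviour (match anywhere in the line).

```python
import re

# Hypothetical pattern list, taken from the question's Ruby example.
REGEXES = [re.compile(r'Foo(\d+)'), re.compile(r'Bar=(.*)$')]

def first_groups(s, regexes=REGEXES):
    """Return the groups of the first regex that matches s, or None."""
    for regex in regexes:
        m = regex.search(s)  # search() scans the whole string, like Ruby's =~
        if m:
            return m.groups()
    return None  # no pattern matched

print(first_groups('Foo42'))    # ('42',)
print(first_groups('Bar=baz'))  # ('baz',)
print(first_groups('3.14'))     # None
```

Returning from inside the loop plays the role of break, and the final return None plays the role of the else clause.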

175

answered Sep 28 '22 13:09

Ignacio Vazquez-Abrams

There are several ways to "bind a name on the fly" in Python, such as my old recipe for "assign and test"; in this case I'd probably choose another such way (assuming Python 2.6 or later; it needs some slight changes if you're working with an older version of Python), something like:

import re
pats_mark = (r'^A:(.*)$', 'FOO'), (r'^B:(.*)$', 'BAR')
for line in lines:
    # First (match object, marker) pair whose pattern matches, else (None, None).
    mo, m = next(((mo, m) for p, m in pats_mark for mo in [re.match(p, line)] if mo),
                 (None, None))
    if mo: print('%s: %s' % (m, mo.group(1)))
    else: print('NO MATCH: %s' % line)

Many minor details can be adjusted, of course. For example, I chose (.*) rather than (.*?) as the matching group: they're equivalent given the immediately following $, so I chose the shorter form ;-). You could also precompile the REs, factor things out differently than the pats_mark tuple (e.g., with a dict indexed by RE patterns), etc.
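As a rough sketch of the "precompile" and "dict" variations just mentioned, the following keeps the same two patterns and markers but stores them in a dict keyed by compiled pattern objects. The function name classify and the sample lines are made up for illustration; this is one possible factoring, not the only one.

```python
import re

# Same patterns and markers as above, precompiled and used as dict keys.
pats_mark = {
    re.compile(r'^A:(.*)$'): 'FOO',
    re.compile(r'^B:(.*)$'): 'BAR',
}

def classify(line):
    """Return (marker, captured text) for the first matching pattern, else (None, None)."""
    for pat, mark in pats_mark.items():
        mo = pat.match(line)
        if mo:
            return mark, mo.group(1)
    return None, None

# Hypothetical sample input.
for line in ['A:one', 'B:two', 'C:three']:
    mark, text = classify(line)
    if mark:
        print('%s: %s' % (mark, text))
    else:
        print('NO MATCH: %s' % line)
```

These two patterns cannot both match the same line, so the dict's iteration order does not matter here. If your patterns can overlap, a list or tuple of pairs makes the priority explicit.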

But the substantial ideas, I think, are to make the structure data-driven, and to bind the match object to a name on the fly with the subexpression for mo in [re.match(p, line)], a "loop" over a single-item list. Genexps bind names only by loop, not by assignment; some consider using this part of genexps' specs to be "tricky", but I consider it a perfectly acceptable Python idiom, especially since it was considered back in the time when listcomps, genexps' "ancestors" in a sense, were being designed.
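To tie both ideas back to the question's Ruby loop, here is a hedged sketch in which each pattern is paired with an action, and the match object is bound via the single-item-list loop. The names parse, state, and rules are hypothetical, and the fallback branch (collect floats once bar has been seen) is kept outside the table because it does not depend on a regex.

```python
import re

def parse(input_lines):
    """Data-driven take on the question's loop; returns the collected state."""
    state = {'foo': None, 'bar': None, 'data': []}
    # Each rule pairs a compiled pattern with the action to run on its match object.
    rules = [
        (re.compile(r'Foo(\d+)'), lambda mo: state.update(foo=int(mo.group(1)))),
        (re.compile(r'Bar=(.*)$'), lambda mo: state.update(bar=mo.group(1))),
    ]
    for line in input_lines:
        # "for mo in [pat.search(line)]" binds the match object inside the genexp.
        hit = next(((mo, act) for pat, act in rules
                    for mo in [pat.search(line)] if mo), None)
        if hit:
            mo, act = hit
            act(mo)
        elif state['bar'] is not None:  # Ruby's "elsif bar": any non-nil value counts
            state['data'].append(float(line))
    return state

print(parse(['Foo12', 'Bar=baz', '1.5', '2']))
# {'foo': 12, 'bar': 'baz', 'data': [1.5, 2.0]}
```

Adding a new case then means adding one (pattern, action) pair to rules rather than another elif branch.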

answered Sep 28 '22 13:09

Alex Martelli
