Python regex matching in conditionals

I am parsing a file and I want to check each line against a few complicated regexes. Something like this:

    if re.match(regex1, line):
        do stuff
    elif re.match(regex2, line):
        do other stuff
    elif re.match(regex3, line):
        do still more stuff
    ...

Of course, to do the stuff, I need the match objects. I can only think of three possibilities, each of which leaves something to be desired.

    if re.match(regex1, line):
        m = re.match(regex1, line)
        do stuff
    elif re.match(regex2, line):
        m = re.match(regex2, line)
        do other stuff
    ...

which requires doing the complicated matching twice (these are long files and long regexes :/)

    m = re.match(regex1, line)
    if m:
        do stuff
    else:
        m = re.match(regex2, line)
        if m:
            do other stuff
        else:
            ...

which gets terrible as I indent further and further.

    while True:
        m = re.match(regex1, line)
        if m:
            do stuff
            break
        m = re.match(regex2, line)
        if m:
            do other stuff
            break
        ...

which just looks weird.

What's the right way to do this?

pythonic metaphor asked May 04 '11 18:05


1 Answer

You could define a function for the action required by each regex and do something like

    def dostuff(m):
        stuff

    def dootherstuff(m):
        otherstuff

    def doevenmorestuff(m):
        evenmorestuff

    actions = ((regex1, dostuff), (regex2, dootherstuff), (regex3, doevenmorestuff))

    for regex, action in actions:
        m = re.match(regex, line)
        if m:
            action(m)  # hand the match object to the handler
            break
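To make the idea concrete, here is a minimal runnable sketch of the same table-driven loop. The patterns and handler names (`SECTION`, `COMMENT`, `ASSIGNMENT`, `handle_*`, `dispatch`) are hypothetical stand-ins for your real regexes and actions. It assumes each handler takes the match object as its only argument, and it precompiles the patterns since they are reused for every line of a long file.

```python
import re

# Hypothetical patterns, for illustration only; substitute your own regexes.
SECTION = re.compile(r"\[(?P<name>[^\]]+)\]")
COMMENT = re.compile(r"#\s*(?P<text>.*)")
ASSIGNMENT = re.compile(r"(?P<key>\w+)\s*=\s*(?P<value>.*)")


def handle_section(m):
    return ("section", m.group("name"))


def handle_comment(m):
    return ("comment", m.group("text"))


def handle_assignment(m):
    return ("assignment", m.group("key"), m.group("value"))


# Order matters: the first pattern that matches wins.
ACTIONS = (
    (SECTION, handle_section),
    (COMMENT, handle_comment),
    (ASSIGNMENT, handle_assignment),
)


def dispatch(line):
    """Return the first matching handler's result, or None if nothing matches."""
    for pattern, action in ACTIONS:
        m = pattern.match(line)  # each pattern is tried at most once per line
        if m:
            return action(m)
    return None
```

Calling `dispatch(line)` for each line of the file keeps the code flat however many patterns you add, and the `None` return gives you a place to handle lines that match nothing.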
Tim Pietzcker answered Oct 08 '22 20:10
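If you can rely on Python 3.8 or later, an assignment expression (`:=`) is another option: it lets the original `if`/`elif` chain bind the match object inside the condition, so each regex is evaluated only once and nothing nests. A small sketch, with hypothetical patterns (`NUMBER`, `WORD`) and a hypothetical `classify` function used purely for illustration:

```python
import re

# Hypothetical patterns, for illustration only.
NUMBER = re.compile(r"(?P<digits>\d+)$")
WORD = re.compile(r"(?P<letters>[A-Za-z]+)$")


def classify(line):
    # Each pattern is matched once; the match object is bound in the condition.
    if m := NUMBER.match(line):
        return ("number", int(m.group("digits")))
    elif m := WORD.match(line):
        return ("word", m.group("letters").lower())
    else:
        return ("unknown", line)
```

This stays closest to the shape of the code in the question, while the table of (regex, action) pairs tends to scale better once there are many patterns.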
