Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Match groups in Python

Tags:

python

regex

Is there a way in Python to access match groups without explicitly creating a match object (or another way to beautify the example below)?

Here is an example to clarify my motivation for the question:

Following Perl code

if    ($statement =~ /I love (\w+)/) {   print "He loves $1\n"; } elsif ($statement =~ /Ich liebe (\w+)/) {   print "Er liebt $1\n"; } elsif ($statement =~ /Je t\'aime (\w+)/) {   print "Il aime $1\n"; } 

translated into Python

m = re.search("I love (\w+)", statement) if m:   print "He loves",m.group(1) else:   m = re.search("Ich liebe (\w+)", statement)   if m:     print "Er liebt",m.group(1)   else:     m = re.search("Je t'aime (\w+)", statement)     if m:       print "Il aime",m.group(1) 

looks very awkward (if-else-cascade, match object creation).

like image 440
Curd Avatar asked Mar 31 '10 15:03

Curd


People also ask

What does match group do in Python?

Match objects in Python regex Match objects contain information about a particular regex match — the position in the string where the match was found, the contents of any capture groups for the match, and so on. You can work with match objects using these methods: match. group() returns the match from the string.

What are groups in regex Python?

What is Group in Regex? A group is a part of a regex pattern enclosed in parentheses () metacharacter. We create a group by placing the regex pattern inside the set of parentheses ( and ) . For example, the regular expression (cat) creates a single group containing the letters 'c', 'a', and 't'.

How do I match a group in regex?

Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g".

How do you name a grouped match in Python?

<name>group) or (?' name'group) captures the match of group into the backreference “name”. The named backreference is \k<name> or \k'name'. Compared with Python, there is no P in the syntax for named groups.


1 Answers

You could create a little class that returns the boolean result of calling match, and retains the matched groups for subsequent retrieval:

import re  class REMatcher(object):     def __init__(self, matchstring):         self.matchstring = matchstring      def match(self,regexp):         self.rematch = re.match(regexp, self.matchstring)         return bool(self.rematch)      def group(self,i):         return self.rematch.group(i)   for statement in ("I love Mary",                    "Ich liebe Margot",                    "Je t'aime Marie",                    "Te amo Maria"):      m = REMatcher(statement)      if m.match(r"I love (\w+)"):          print "He loves",m.group(1)       elif m.match(r"Ich liebe (\w+)"):         print "Er liebt",m.group(1)       elif m.match(r"Je t'aime (\w+)"):         print "Il aime",m.group(1)       else:          print "???" 

Update for Python 3 print as a function, and Python 3.8 assignment expressions - no need for a REMatcher class now:

import re  for statement in ("I love Mary",                   "Ich liebe Margot",                   "Je t'aime Marie",                   "Te amo Maria"):      if m := re.match(r"I love (\w+)", statement):         print("He loves", m.group(1))      elif m := re.match(r"Ich liebe (\w+)", statement):         print("Er liebt", m.group(1))      elif m := re.match(r"Je t'aime (\w+)", statement):         print("Il aime", m.group(1))      else:         print() 
like image 96
PaulMcG Avatar answered Sep 19 '22 04:09

PaulMcG