Is there a way in Python to access match groups without explicitly creating a match object (or another way to beautify the example below)?
Here is an example to clarify my motivation for the question:
The following Perl code
    if ($statement =~ /I love (\w+)/) {
        print "He loves $1\n";
    }
    elsif ($statement =~ /Ich liebe (\w+)/) {
        print "Er liebt $1\n";
    }
    elsif ($statement =~ /Je t\'aime (\w+)/) {
        print "Il aime $1\n";
    }
translated into Python
    m = re.search("I love (\w+)", statement)
    if m:
        print "He loves",m.group(1)
    else:
        m = re.search("Ich liebe (\w+)", statement)
        if m:
            print "Er liebt",m.group(1)
        else:
            m = re.search("Je t'aime (\w+)", statement)
            if m:
                print "Il aime",m.group(1)
looks very awkward (if-else-cascade, match object creation).
Match objects in Python regex: a match object holds information about a particular regex match, such as the position in the string where the match was found and the contents of any capture groups. You can query it with methods such as match.group(), which returns the matched text (or the text of one capture group when given a group number or name), and match.start(), match.end(), and match.span(), which return positions.
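As a small illustration of the match object methods mentioned above, here is a minimal Python 3 sketch. The sample string is borrowed from the question; the variable names are only for illustration.

```python
import re

statement = "I love Mary"
m = re.search(r"I love (\w+)", statement)

# re.search returns None when nothing matches, so check before using m.
if m is not None:
    whole = m.group(0)   # the entire matched text
    name = m.group(1)    # the text of the first capture group
    span = m.span(1)     # (start, end) offsets of group 1 within statement
    print(whole, name, span)
```

Slicing the original string with the offsets from span() gives back the same text as group(1).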
What is a group in regex? A group is a part of a regex pattern enclosed in parentheses, ( and ). Grouping lets you treat several characters as a single unit and, for a capturing group, retrieve the text that this part of the pattern matched. For example, the regular expression (cat) creates a single group containing the letters 'c', 'a', and 't'; likewise, (dog) creates a single group containing 'd', 'o', and 'g'.
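To make the "single unit" idea concrete, here is a short Python 3 sketch. The patterns and sample strings are made up for illustration.

```python
import re

# Capture groups are numbered left to right by their opening parenthesis.
m = re.match(r"(\w+) loves (\w+)", "John loves Mary")
pair = m.groups()    # tuple with the text of every capture group

# A quantifier placed after a group applies to the whole group,
# so (dog)+ matches "dog" repeated, while dog+ only repeats the "g".
repeated = re.fullmatch(r"(dog)+", "dogdogdog")
not_repeated = re.fullmatch(r"dog+", "dogdogdog")

print(pair, repeated is not None, not_repeated is None)
```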
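In some other regex flavors (for example .NET), (?<name>group) or (?'name'group) captures the match of group into the backreference "name", and the named backreference is \k<name> or \k'name'. Unlike Python, that syntax has no P: Python writes a named group as (?P<name>group) and refers back to it with (?P=name).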
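For comparison, here is a minimal sketch of Python's own named-group syntax with the re module. The group names "name" and "word" are arbitrary choices for this example.

```python
import re

# (?P<name>...) defines a named group; it is still reachable by number too.
m = re.search(r"I love (?P<name>\w+)", "I love Mary")
by_name = m.group("name")
by_number = m.group(1)
as_dict = m.groupdict()    # maps each group name to its matched text

# (?P=word) is a named backreference: it must repeat what (?P<word>...) captured.
doubled = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it is is fine")

print(by_name, as_dict, doubled.group("word"))
```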
You could create a little class that returns the boolean result of calling match, and retains the matched groups for subsequent retrieval:
    import re

    class REMatcher(object):
        def __init__(self, matchstring):
            self.matchstring = matchstring

        def match(self,regexp):
            self.rematch = re.match(regexp, self.matchstring)
            return bool(self.rematch)

        def group(self,i):
            return self.rematch.group(i)

    for statement in ("I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"):
        m = REMatcher(statement)
        if m.match(r"I love (\w+)"):
            print "He loves",m.group(1)
        elif m.match(r"Ich liebe (\w+)"):
            print "Er liebt",m.group(1)
        elif m.match(r"Je t'aime (\w+)"):
            print "Il aime",m.group(1)
        else:
            print "???"
Update for Python 3, where print is a function, and Python 3.8, which adds assignment expressions, so the REMatcher class is no longer needed:
    import re

    for statement in ("I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"):
        if m := re.match(r"I love (\w+)", statement):
            print("He loves", m.group(1))
        elif m := re.match(r"Ich liebe (\w+)", statement):
            print("Er liebt", m.group(1))
        elif m := re.match(r"Je t'aime (\w+)", statement):
            print("Il aime", m.group(1))
        else:
            print()
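Another way to flatten the cascade, which also works on Python versions before 3.8, is to loop over a table of patterns. This is only a sketch of an alternative and not part of the answer above: the names PATTERNS and translate are hypothetical, and it uses re.search as in the question rather than re.match.

```python
import re

# Hypothetical table: (compiled pattern, output prefix), tried in order.
PATTERNS = [
    (re.compile(r"I love (\w+)"), "He loves"),
    (re.compile(r"Ich liebe (\w+)"), "Er liebt"),
    (re.compile(r"Je t'aime (\w+)"), "Il aime"),
]

def translate(statement):
    """Return the phrase for the first pattern that matches, or None."""
    for pattern, prefix in PATTERNS:
        m = pattern.search(statement)
        if m:
            return f"{prefix} {m.group(1)}"
    return None

for statement in ("I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"):
    print(translate(statement) or "???")
```

Adding another language then means adding one row to the table instead of another elif branch.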