
Find out number of capture groups in Python regular expressions

Tags: python, regex

Is there a way to determine how many capture groups there are in a given regular expression?

I would like to be able to do the following:

    import re

    def groups(regexp, s):
        r""" Returns the groups of the first re.search match, or an empty default

        >>> groups(r'(\d)(\d)(\d)', '123')
        ('1', '2', '3')
        >>> groups(r'(\d)(\d)(\d)', 'abc')
        ('', '', '')
        """
        m = re.search(regexp, s)
        if m:
            return m.groups()
        # num_of_groups is the missing piece this question asks about
        return ('',) * num_of_groups(regexp)

This allows me to do stuff like:

    first, last, phone = groups(r'(\w+) (\w+) ([\d\-]+)', 'John Doe 555-3456')

However, I don't know how to implement num_of_groups. (Currently I just work around it.)

EDIT: Following the advice from rslite, I replaced re.findall with re.search.

sre_parse seems like the most robust and comprehensive solution, but requires tree traversal and appears to be a bit heavy.

MizardX's regular expression seems to cover all bases, so I'm going to go with that.

asked Sep 24 '08 13:09 by itsadok



1 Answer

    import re

    def num_groups(regex):
        # A compiled pattern exposes its capture-group count as .groups
        return re.compile(regex).groups
answered Oct 01 '22 14:10 by Markus Jarderot
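
For reference, here is a minimal sketch of how the answer's one-liner could be combined with the groups() helper from the question. The `default` parameter is an assumption added for illustration and is not part of the original code. Compiling the pattern once lets the same object serve both the search and the group count.

```python
import re

def num_groups(regex):
    """Number of capture groups in the pattern; (?:...) groups are not counted."""
    return re.compile(regex).groups

def groups(regex, s, default=''):
    """Groups of the first match in s, or a tuple of defaults of the same length.

    `default` is a hypothetical extra parameter, not in the original question.
    """
    pattern = re.compile(regex)
    m = pattern.search(s)
    if m:
        # Groups that did not participate in the match also get `default`
        return m.groups(default)
    return (default,) * pattern.groups

first, last, phone = groups(r'(\w+) (\w+) ([\d\-]+)', 'John Doe 555-3456')
```

One design note: passing `default` to m.groups() means optional groups that did not take part in a match come back as '' rather than None, so the tuple has the same shape whether or not the search succeeds. If the None values matter, call m.groups() without an argument instead.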