
python group(0) meaning

Tags: python, regex

What is the exact definition of group(0) in re.search?

Sometimes the search can get complex, and I would like to know what value group(0) is supposed to have by definition.

To give an example of where the confusion comes from, consider the match below. The printed result is only def, so in this case group(0) does not seem to return the entire match.

>>> import re
>>> m = re.search('(?<=abc)def', 'abcdef')
>>> m.group(0)
'def'
apadana asked Apr 19 '16 20:04


2 Answers

match_object.group(0) returns the whole substring that the pattern matched.

In addition, group(0) can be explained by comparing it with group(1), group(2), group(3), ..., group(n). group(0) is the whole match. Parentheses define further groups inside it: group(1) is the text matched by the first capturing pair of parentheses, group(2) is the text matched by the second, and so on. Groups are numbered by the position of their opening parenthesis, counting from left to right, and each opening parenthesis pairs with its own matching closing parenthesis, so groups can be nested. This probably sounds confusing, which is why there is an example below.

However, the parentheses in '(?<=abc)' are different. They do not create a numbered group; they delimit a lookbehind assertion, introduced by '?<='. So the main source of the confusion is what '?<=' does. A lookbehind checks that the text immediately before the current position matches the enclosed pattern, but it does not consume that text, so the text is not part of the match.

In the following example, 'abc' is the pattern inside the lookbehind.

No parentheses are needed for group 0, since it is always the whole match.

The opening parenthesis in front of the letter 'd' pairs with the closing parenthesis in front of the letter 'f' to form group 1.

The parentheses around the letter 'e' define group 2.

import re

m = re.search('(?<=abc)(d(e))f', 'abcdef')

print(m.group(0))
print(m.group(1))
print(m.group(2))

This prints:

def
de
e
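
As a further sketch of the numbering rule, the variant below adds a non-capturing group (?:f) to the pattern above (that addition is only for illustration and is not part of the original example). It suggests that neither the lookbehind nor the non-capturing group receives a number, and that calling group() with no argument is the same as group(0).

```python
import re

# Capturing groups are numbered by the position of their opening parenthesis.
# The lookbehind (?<=abc) and the non-capturing (?:f) are not numbered.
pattern = re.compile(r'(?<=abc)(d(e))(?:f)')
m = pattern.search('abcdef')

print(pattern.groups)   # number of capturing groups in the pattern
print(m.group())        # no argument is the same as group(0)
print(m.group(0))       # the whole match
print(m.groups())       # groups 1..n only, group 0 is not included
print(m.group(1, 2))    # several groups at once, returned as a tuple
```

Note that groups() starts at group 1, so the whole match has to be fetched with group(0) or group().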
manuel_va answered Sep 18 '22 17:09


group(0) returns the full string matched by the regex. It's just that abc isn't part of the match. (?<=abc) doesn't match abc - it matches any position in the string immediately preceded by abc.
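
One way to check this is to look at where the match starts and ends. The sketch below compares the lookbehind pattern from the question with a pattern that consumes abc (the second pattern, 'abc(def)', is only an illustration and does not appear in the question).

```python
import re

text = 'abcdef'

# Lookbehind: 'abc' must precede the match but is not consumed by it.
lookbehind = re.search(r'(?<=abc)def', text)
print(lookbehind.group(0), lookbehind.span())   # match covers indices 3..6

# No lookbehind: 'abc' is consumed, so it is part of group(0).
consuming = re.search(r'abc(def)', text)
print(consuming.group(0), consuming.span())     # match covers indices 0..6
print(consuming.group(1), consuming.span(1))    # group 1 covers indices 3..6
```

In both cases group(0) is exactly text[m.start():m.end()]; the lookbehind only moves where the match is allowed to start.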

user2357112 supports Monica answered Sep 20 '22 17:09