
How does the Python re module work in this example?

Tags: python, regex

What is the process of matching this regular expression? I don't get why the explicit group is 'c'. This piece of code is taken from the Python re module documentation.

>>> m = re.match("([abc])+", "abc")
>>> m.group()
'abc'
>>> m.groups()
('c',)

Also, what about:

>>> m = re.match("([abc]+)", "abc")
>>> m.group()
'abc'
>>> m.groups()
('abc',)

And:

>>> m = re.match("([abc])", "abc")
>>> m.group()
'a'
>>> m.groups()
('a',)

Thanks.

knd asked Jan 13 '23 11:01


2 Answers

re.match("([abc])+", "abc")

Matches one or more repetitions of a group that holds a single character: a, b or c. The overall match is greedy and consumes all of abc, but a repeated group only keeps the text from its last repetition, so the group ends up as the last matching character, which is c.

m = re.match("([abc]+)", "abc")

Matches a group that contains one or more consecutive occurrences of a, b or c. The group at the end is the longest contiguous run of a, b or c, starting at the beginning of the string.

re.match("([abc])", "abc")

Matches exactly one character: a, b or c. Because re.match only matches at the start of the string, the group will always be the first character of the string, provided that character is a, b or c.
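To make the difference between the three patterns concrete, here is a small sketch that prints the whole match, the group text, and the group's span for each one. The variable names (`repeated`, `run`, `single`) are only illustrative and do not come from the question.

```python
import re

text = "abc"

# Quantifier outside the group: the group is captured again on every
# repetition, so only the text of the last repetition is kept.
repeated = re.match(r"([abc])+", text)

# Quantifier inside the group: the group covers the whole run.
run = re.match(r"([abc]+)", text)

# No quantifier: the group matches exactly one character.
single = re.match(r"([abc])", text)

for name, m in [("repeated", repeated), ("run", run), ("single", single)]:
    # group(0) is the whole match, group(1) the first capturing group,
    # span(1) the (start, end) indices that group covers in the string.
    print(name, m.group(0), m.group(1), m.span(1))

# repeated abc c (2, 3)
# run abc abc (0, 3)
# single a a (0, 1)
```

The span shows that in the first pattern the whole match covers abc while the group itself only covers the final character.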

Jon Clements answered Jan 16 '23 18:01


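In your first example, ([abc])+ matches the group once for each a, b, or c character it finds, but only the last of those matches is kept. c is the reported group because it's the last character that the regex matches. If we append an a to the string, the group becomes a: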

>>> re.match("([abc])+", "abca").groups()
('a',)
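If the goal is to see every character rather than only the last one, one possible approach (not something the question asked for, just a sketch) is to drop the repeated group and use re.findall or re.finditer with the bare character class. Note that, unlike re.match, these scan the whole string instead of anchoring at the start; for the string abca the result is the same either way.

```python
import re

text = "abca"

# A repeated group keeps only its last repetition.
last_only = re.match(r"([abc])+", text).group(1)

# findall returns every non-overlapping match of the character class.
every_char = re.findall(r"[abc]", text)

# finditer yields match objects, so positions are available as well.
positions = [(m.group(), m.start()) for m in re.finditer(r"[abc]", text)]

print(last_only)   # a
print(every_char)  # ['a', 'b', 'c', 'a']
print(positions)   # [('a', 0), ('b', 1), ('c', 2), ('a', 3)]
```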

In your second example, you're creating one group that matches one or more a's, b's, or c's in a row. Thus, you create one group for abc. If we extend abc, the group will extend with the string:

>>> re.match("([abc]+)", "abca").groups()
('abca',)

In your third example, the regex is searching for exactly one character that is either an a, a b, or a c. Since a is the first character in abc, you get an a. This changes if we change the first character in the string:

>>> re.match("([abc])", "cba").group()
'c'
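The "first character" behaviour comes from re.match only trying the pattern at the start of the string. As a rough illustration, using a made-up input string xcba that is not part of the question, re.match fails when the first character is outside the class, while re.search finds the first matching character anywhere in the string:

```python
import re

pattern = r"([abc])"
text = "xcba"  # hypothetical input whose first character is not a, b or c

# re.match only attempts a match at position 0, so it fails here.
anchored = re.match(pattern, text)

# re.search scans forward and stops at the first position that matches.
anywhere = re.search(pattern, text)

print(anchored)           # None
print(anywhere.group(1))  # c
print(anywhere.start())   # 1
```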
Nolen Royalty answered Jan 16 '23 17:01