
python re ?: example [duplicate]

Tags: python, regex

I saw the regular expression (?= (?:\d{5}|[A-Z]{2})) in a Python re example and was confused about the meaning of the ?: part.

I also read the Python docs, which explain it as follows:

(?:...)

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

Can someone give me an example and explain why it works? Thanks!

asked Apr 10 '14 13:04 by Thompson




2 Answers

Ordinarily, parentheses create a "capturing" group inside your regex:

import re
regex = re.compile(r"(set|let) var = (\w+|\d+)")
print(regex.match("set var = 12").groups())

results:

('set', '12')

You can retrieve those groups later by calling the .groups() method on the result of a match. As you can see, whatever is inside parentheses is captured in a group. But you might not care about all of those groups. Say you only want the contents of the second group and not the first. You still need the first set of parentheses to group the alternatives "set" and "let", but you can turn off capturing by putting "?:" right after the opening parenthesis:

regex = re.compile(r"(?:set|let) var = (\w+|\d+)")
print(regex.match("set var = 12").groups())

results:

('12',)
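The same idea can be applied to the pattern from the question. The following is only a sketch: the sample string and the variable names are made up for illustration, and the capturing variant is added purely for comparison. The lookahead (?=...) finds the position in front of a space followed by five digits or two capital letters, and the (?:...) inside it only groups the two alternatives.

```python
import re

text = "Springfield IL 62704"  # hypothetical sample input

# Pattern from the question: the lookahead checks for a space followed by
# five digits or two capital letters, without consuming any characters.
non_capturing = re.compile(r"(?= (?:\d{5}|[A-Z]{2}))")
# Same pattern with ordinary (capturing) parentheses, for comparison only.
capturing = re.compile(r"(?= (\d{5}|[A-Z]{2}))")

m1 = non_capturing.search(text)
m2 = capturing.search(text)

print(m1.start(), m1.groups())  # 11 ()
print(m2.start(), m2.groups())  # 11 ('IL',)
```

Both patterns match at the same position; the only difference is whether the alternation's match is stored as group 1.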
answered Oct 01 '22 14:10 by Elektito


If you do not need the group to capture its match, you can optimize this regular expression [Set(Value)?] into Set(?:Value)?. The question mark and the colon after the opening parenthesis are the syntax that creates a non-capturing group. The question mark after the opening bracket is unrelated to the question mark at the end of the regex. The final question mark is the quantifier that makes the previous token optional. This quantifier cannot appear after an opening parenthesis, because there is nothing to be made optional at the start of a group. Therefore, there is no ambiguity between the question mark as an operator to make a token optional and the question mark as part of the syntax for non-capturing groups, even though this may be confusing at first. There are other kinds of groups that use the (? syntax in combination with other characters than the colon that are explained later in this tutorial.
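A quick way to see the two question marks doing different jobs is to compare Set(Value)? with Set(?:Value)? in Python. This is a minimal sketch; the variable names and test strings are chosen here for illustration.

```python
import re

capturing = re.compile(r"Set(Value)?")        # optional capturing group
non_capturing = re.compile(r"Set(?:Value)?")  # optional non-capturing group

for text in ("Set", "SetValue"):
    m1 = capturing.fullmatch(text)
    m2 = non_capturing.fullmatch(text)
    # Both patterns accept the same strings; only the stored groups differ.
    print(text, m1.groups(), m2.groups())

# Set (None,) ()
# SetValue ('Value',) ()
```

The trailing ? makes "Value" optional in both patterns, while ?: only decides whether the group is recorded.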

color=(?:red|green|blue) is another regex with a non-capturing group. This regex has no quantifiers.
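In Python the difference becomes very visible with re.findall, which returns the group contents when the pattern has a capturing group and the whole match when it has none. A small sketch with a made-up input string:

```python
import re

text = "color=red; color=blue; color=pink"  # hypothetical sample input

# With a capturing group, findall returns only what the group matched.
with_group = re.findall(r"color=(red|green|blue)", text)
# With a non-capturing group, findall returns the entire match.
without_group = re.findall(r"color=(?:red|green|blue)", text)

print(with_group)     # ['red', 'blue']
print(without_group)  # ['color=red', 'color=blue']
```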

From: http://www.regular-expressions.info/brackets.html

Also read: What is a non-capturing group? What does a question mark followed by a colon (?:) mean?

answered Oct 01 '22 15:10 by Martyn