
What does "?:" mean in a Python regular expression?

Tags:

python

regex

Below is the Python regular expression. What does the ?: mean in it? What does the expression do overall? How does it match a MAC address such as "00:07:32:12:ac:de:ef"?

re.compile('([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})', string)
Hari asked May 29 '12 05:05


1 Answer

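The (?:...) construct is a set of non-capturing grouping parentheses.

Normally, when you write (...) in a regex, it 'captures' the matched material so that you can retrieve it later by group number. The non-capturing version still groups the enclosed pattern (for example, so that a quantifier applies to all of it), but it doesn't capture.

After the regex matches a particular string, you can get at the captured parts through the match object that the re module returns, for example with its group() and groups() methods.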
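To make the difference concrete, here is a minimal sketch that runs the same pattern twice, once with a capturing inner group and once with the non-capturing form. The variable names and the six-pair sample string are illustrative assumptions, not part of the question.

```python
import re

# Illustrative six-pair sample (an assumption, not the string from the question).
sample = "00:07:32:12:ac:de"

# Inner group written as a capturing group: it becomes group 2.
capturing = re.match(r"([\dA-Fa-f]{2}([:-][\dA-Fa-f]{2}){5})", sample)

# Inner group written as a non-capturing group: only group 1 exists.
non_capturing = re.match(r"([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})", sample)

# A repeated capturing group keeps only its last repetition.
print(capturing.groups())      # ('00:07:32:12:ac:de', ':de')
print(non_capturing.groups())  # ('00:07:32:12:ac:de',)
```

Both patterns match the same text; the only change is which numbered groups are available afterwards.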


How does this regular expression match MAC address "00:07:32:12:ac:de:ef"?

That's a different question from what you initially asked. However, the regex part is:

([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})

The outermost pair of parentheses is a capturing group; the text it surrounds will be available after the regex matches a string successfully.

The [\dA-Fa-f]{2} part matches a pair ({2}) of hex digits, each being a decimal digit (\d) or one of the letters A-F or a-f. It is followed by a non-capturing group whose matched material is a colon or dash ([:-]) followed by another pair of hex digits, with the whole group repeated exactly 5 times.
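The same breakdown can be written into the pattern itself using the re.VERBOSE flag, which lets a regex carry whitespace and comments. This is only a sketch of an equivalent pattern; the name mac_pattern is an assumption made for illustration.

```python
import re

# Same regex as above, spread out with re.VERBOSE so each piece is annotated.
mac_pattern = re.compile(r"""
    (                   # group 1: the whole six-pair address
      [\dA-Fa-f]{2}     # first pair of hex digits
      (?:               # non-capturing group, present only for repetition
        [:-]            # separator: colon or dash
        [\dA-Fa-f]{2}   # next pair of hex digits
      ){5}              # exactly five more pairs
    )
""", re.VERBOSE)

m = mac_pattern.match("00-07-32-12-AC-DE")
print(m.group(1))  # 00-07-32-12-AC-DE
```

Note that [:-] is checked independently on each repetition, so a string that mixes colons and dashes would also match.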

import re

# Raw string so the backslash in \d reaches the regex engine unchanged.
p = re.compile(r"([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})")
m = p.match("00:07:32:12:ac:de:ef")
if m:
    print(m.group(1))

The last line should print the string "00:07:32:12:ac:de" because that is the first set of six pairs of hex digits (out of the seven pairs in total in the string). In fact, the outer capturing parentheses are redundant; if you omit them, m.group(0) returns the same text (it works even with them). If you need to match seven pairs, change the 5 into a 6. If you need to reject strings that have more pairs than expected, put anchors into the regex:

p = re.compile(r"^([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})$")

The caret ^ matches the start of string; the dollar $ matches the end of string. With the 5, that would not match your sample string. With 6 in place of 5, it would match your string.
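As a quick check of the anchored behaviour, the sketch below compiles both variants and runs them against the sample string from the question. The names six_pairs and seven_pairs are assumptions made for illustration; re.fullmatch is shown as one alternative to writing the anchors by hand.

```python
import re

sample = "00:07:32:12:ac:de:ef"  # seven pairs, as in the question

six_pairs = re.compile(r"^([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5})$")
seven_pairs = re.compile(r"^([\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){6})$")

print(six_pairs.match(sample))              # None: one pair too many
print(seven_pairs.match(sample).group(1))   # 00:07:32:12:ac:de:ef

# re.fullmatch requires the whole string to match, so no anchors are needed.
unanchored = r"[\dA-Fa-f]{2}(?:[:-][\dA-Fa-f]{2}){5}"
print(re.fullmatch(unanchored, sample))     # None
print(re.fullmatch(unanchored, "00:07:32:12:ac:de").group(0))  # 00:07:32:12:ac:de
```

One difference worth knowing: $ also matches just before a trailing newline, whereas re.fullmatch does not accept one.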

Jonathan Leffler answered Sep 24 '22 01:09