
python regex - findall not returning output as expected

I am having trouble understanding re.findall, whose documentation says:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

Why doesn't this basic IP regex work with findall as expected? The matches do not overlap, and regexpal highlights both addresses when I test the pattern in re_str against the line.


Expected: ['1.2.2.3', '123.345.34.3']

Actual: ['2.', '34.']

import re

re_str = r'(\d{1,3}\.){3}\d{1,3}'
line = 'blahblah -- 1.2.2.3 blah 123.345.34.3'
matches = re.findall(re_str, line)
print(matches)    # ['2.', '34.']
chocalaca asked Dec 01 '25 20:12


2 Answers

When your regex contains capturing parentheses, re.findall() returns only the text captured by the groups, not the entire matched string. Put ?: right after the opening ( to make the group non-capturing; with no capturing groups left in the pattern, findall returns the entire matched string.
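As a quick check, the sketch below runs the pattern from the question with and without ?: on the same input line. The variable names capturing and non_capturing are chosen here for illustration only.

```python
import re

line = 'blahblah -- 1.2.2.3 blah 123.345.34.3'

# Capturing group: findall returns what the group captured, not the whole match.
capturing = re.findall(r'(\d{1,3}\.){3}\d{1,3}', line)

# Non-capturing group (?:...): the pattern has no groups, so findall
# returns each whole match.
non_capturing = re.findall(r'(?:\d{1,3}\.){3}\d{1,3}', line)

print(capturing)      # ['2.', '34.']
print(non_capturing)  # ['1.2.2.3', '123.345.34.3']
```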

TallChuck answered Dec 03 '25 08:12


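This is because a repeated capturing group keeps only the text from its last repetition. With (\d{1,3}\.){3}, the group matches three times per address, but only the third piece is kept: '2.' for 1.2.2.3 and '34.' for 123.345.34.3.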
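To see this on a single match, a small sketch with re.search can compare the whole match against the group. The names whole and last_repetition are illustrative, and the input is a shortened version of the line from the question.

```python
import re

# Same pattern as in the question, applied to one address.
m = re.search(r'(\d{1,3}\.){3}\d{1,3}', 'blahblah -- 1.2.2.3 blah')

whole = m.group(0)            # the entire match
last_repetition = m.group(1)  # only the final repetition of the group

print(whole)            # 1.2.2.3
print(last_repetition)  # 2.
```

The group matched '1.', then '2.', then '2.' again; each repetition overwrote the previous capture.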

Instead, you should make the repeating group non-capturing, and use a non-repeated capture at an outer layer:

re_str = r'((?:\d{1,3}\.){3}\d{1,3})'

Note that for findall, if the pattern has no capturing group, the whole match is returned automatically (the same text as group 0, i.e. match.group(0)), so you could drop the outer capture:

re_str = r'(?:\d{1,3}\.){3}\d{1,3}'
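A short sketch comparing the two patterns above on the line from the question. The third pattern, which keeps the inner group capturing inside an outer capture, is not from the answer; it is added here only to illustrate the "list of tuples" case from the documentation quoted in the question. All variable names are illustrative.

```python
import re

line = 'blahblah -- 1.2.2.3 blah 123.345.34.3'

# One outer capturing group: findall returns that group's text.
outer = re.findall(r'((?:\d{1,3}\.){3}\d{1,3})', line)

# No capturing group: findall returns the whole match.
no_group = re.findall(r'(?:\d{1,3}\.){3}\d{1,3}', line)

# Two capturing groups (outer and inner): findall returns tuples,
# and the inner group still holds only its last repetition.
both = re.findall(r'((\d{1,3}\.){3}\d{1,3})', line)

print(outer)     # ['1.2.2.3', '123.345.34.3']
print(no_group)  # ['1.2.2.3', '123.345.34.3']
print(both)      # [('1.2.2.3', '2.'), ('123.345.34.3', '34.')]
```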
iBug answered Dec 03 '25 10:12