Why does re.findall return a list of tuples when my pattern only contains one group?

Question

Say I have a string s containing letters and two delimiters 1 and 2. I want to split the string in the following way:

if a substring t falls between 1 and 2, return t
otherwise, return each character

So if s = 'ab1cd2efg1hij2k', the expected output is ['a', 'b', 'cd', 'e', 'f', 'g', 'hij', 'k'].

I tried to use regular expressions:

import re
s = 'ab1cd2efg1hij2k'
re.findall( r'(1([a-z]+)2|[a-z])', s )

[('a', ''),
 ('b', ''),
 ('1cd2', 'cd'),
 ('e', ''),
 ('f', ''),
 ('g', ''),
 ('1hij2', 'hij'),
 ('k', '')]

From there I can do [ x[x[-1]!=''] for x in re.findall( r'(1([a-z]+)2|[a-z])', s ) ] to get my answer, but I still don't understand the output. The documentation says that findall returns a list of tuples if the pattern has more than one group. However, my pattern only contains one group. Any explanation is welcome.

Sebastian N · Accepted Answer

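Every pair of plain parentheses is a capturing group, so a pattern with an 'or' wrapped in its own parentheses inside an outer group has two groups, and findall returns tuples. If you want an 'or' match without it being split out into its own match group, add '?:' right after the opening parenthesis of the 'or' group. That makes the group non-capturing.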

Without '?:'

re.findall('(test (word1|word2))', 'test word1')

Output:
[('test word1', 'word1')]

With '?:'

re.findall('(test (?:word1|word2))', 'test word1')

Output:
['test word1']

Further explanation: https://www.ocpsoft.org/tutorials/regular-expressions/or-in-regex/
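
Applied to the pattern from the question, the same idea can be sketched as follows. This is only an illustration, not the only way to do it: the variable names (original, one_group, result, lookaround) are made up for the example, and the lookaround variant assumes every 1 is eventually closed by a 2.

```python
import re

s = 'ab1cd2efg1hij2k'

# The question's pattern has two capturing groups: the outer (...) and the
# inner ([a-z]+). That is why findall returns 2-tuples.
original = re.compile(r'(1([a-z]+)2|[a-z])')
print(original.groups)  # 2

# Making the outer group non-capturing leaves a single group, so findall
# returns plain strings. Matches of the bare [a-z] branch come back as ''
# because group 1 did not take part in them.
one_group = re.compile(r'(?:1([a-z]+)2|[a-z])')
print(one_group.findall(s))  # ['', '', 'cd', '', '', '', 'hij', '']

# finditer yields match objects, so we can fall back to the whole match
# whenever group 1 is None.
result = [m.group(1) or m.group(0) for m in one_group.finditer(s)]
print(result)  # ['a', 'b', 'cd', 'e', 'f', 'g', 'hij', 'k']

# Alternative with no groups at all: lookarounds check for the delimiters
# without including them in the match.
lookaround = re.findall(r'(?<=1)[a-z]+(?=2)|[a-z]', s)
print(lookaround)  # ['a', 'b', 'cd', 'e', 'f', 'g', 'hij', 'k']
```

The finditer version keeps the pattern close to the original. The lookaround version lets findall return the final list directly because the pattern contains no capturing group.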

Greem666 · Answer

I am 5 years too late to the party, but I think I might have found an elegant solution to the re.findall() ugly tuple-ridden output with multiple capture groups.

In general, if you end up with output that looks something like this:

[('pattern_1', '', ''), ('', 'pattern_2', ''), ('pattern_1', '', ''), ('', '', 'pattern_3')]

Then, as long as only one group is non-empty in each tuple, you can flatten it into a list with this little trick:

["".join(x) for x in re.findall(all_patterns, iterable)]

The expected output will be like so:

['pattern_1', 'pattern_2', 'pattern_1', 'pattern_3']

It was tested on Python 3.7. Hope it helps!
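
Here is a small self-contained sketch of the join trick. The pattern and the sample text are made up for illustration. It also shows the case where the trick does not apply: with nested groups, as in the question's pattern, more than one group can be non-empty in the same tuple, and joining concatenates them.

```python
import re

# Hypothetical sample: three alternatives, each in its own capturing group.
all_patterns = r'(foo)|(bar)|(baz)'
text = 'foo bar foo baz'

tuples = re.findall(all_patterns, text)
print(tuples)
# [('foo', '', ''), ('', 'bar', ''), ('foo', '', ''), ('', '', 'baz')]

# Exactly one group is non-empty per tuple, so joining recovers the match.
flat = ["".join(x) for x in tuples]
print(flat)  # ['foo', 'bar', 'foo', 'baz']

# Caveat: with nested groups both the outer and the inner group can be
# filled for the same match, so the join duplicates the inner text.
nested = ["".join(x) for x in re.findall(r'(1([a-z]+)2|[a-z])', 'ab1cd2')]
print(nested)  # ['a', 'b', '1cd2cd']
```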

Tags: python, regex, findall

usual me
