Get the regex match and the rest (non-match) from Python's re module

Question

Does Python 3's re module offer a built-in way to get both the match and the rest (the non-match) back?

Here is a simple example:

>>> import re
>>> p = r'\d'
>>> s = '1a'
>>> re.findall(p, s)
['1']

The result I want is something like ['1', 'a'] or [['1'], ['a']] or something else where I can differentiate between match and rest.

Of course, I can subtract the resulting (matching) string from the original one to get the rest. But is there a built-in way to do this?

I did not set the regex tag here because the question is less about regular expressions themselves and more about a feature of a Python package.

Nilton Moura · Accepted Answer

You can match everything and use groups to separate the important part from the rest:

>>> import re
>>> p = r'(\d+)(.*)'
>>> s = '12a\n34b\ncde'
>>> re.findall(p, s)
[('12', 'a'), ('34', 'b')]

re.findall documentation
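A related built-in worth noting, as a sketch rather than part of the answer above: re.split keeps the matched text in its result when the pattern is wrapped in a capturing group, so matches and the text between them come back in one list. The helper name split_match_and_rest is hypothetical, and the sketch assumes the pattern passed in has no capturing groups of its own (use (?:...) inside it if grouping is needed).

```python
import re

def split_match_and_rest(pattern, text):
    """Return (is_match, piece) pairs covering the whole of text.

    Hypothetical helper. Assumes `pattern` contains no capturing groups,
    so that after wrapping it in one group, re.split yields the pieces
    alternately: even indexes are non-matching text, odd indexes are matches.
    """
    pieces = re.split(f'({pattern})', text)
    # Tag each piece by its position before dropping empty strings,
    # so the match/rest parity is preserved.
    return [(i % 2 == 1, piece) for i, piece in enumerate(pieces) if piece]

print(split_match_and_rest(r'\d', '1a'))
# [(True, '1'), (False, 'a')]
```

Unlike the (\d+)(.*) pattern, this also keeps text that appears before the first match.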

gremur · Answer

A possible solution is the following:

import re

string = '1a'
re_pattern = r'^(\d+)(.*)'

result = re.findall(re_pattern, string)
print(result)

This returns a list of tuples:

[('1', 'a')]

or, if you would rather get a flat list of str items:

result = [item for t in re.findall(re_pattern, string) for item in t]
print(result)

This returns:

['1', 'a']

Explanations to the code:

re_pattern = r'^(\d+)(.*)' looks for two groups at the start of the string: the 1st group (\d+) means one or more digits, the 2nd group (.*) means the rest of the string (up to the next newline, since . does not match a newline by default).
re.findall(re_pattern, string) returns a list of tuples like [('1', 'a')]
The list comprehension converts the list of tuples into a flat list of string items
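The anchored pattern ^(\d+)(.*) only matches when the digits come first in the string. For input where matches can sit anywhere, a minimal sketch is to walk re.finditer and use each match's start() and end() to slice out the text in between. The function name matches_and_rest is hypothetical and not something the re module provides.

```python
import re

def matches_and_rest(pattern, text):
    """Return (matches, rest): matched substrings and the text between them.

    Hypothetical helper built on re.finditer and match positions.
    """
    matches, rest = [], []
    pos = 0  # end of the previous match
    for m in re.finditer(pattern, text):
        if m.start() > pos:
            rest.append(text[pos:m.start()])  # unmatched text before this match
        matches.append(m.group(0))
        pos = m.end()
    if pos < len(text):
        rest.append(text[pos:])  # unmatched tail after the last match
    return matches, rest

print(matches_and_rest(r'\d', '1a'))
# (['1'], ['a'])
```

Because it uses m.group(0), this works the same whether or not the pattern contains groups of its own.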
