I want to convert a string like
'''1 2 3 4 5 6 abcde fghij klmno pqrst 7 8 9 10 uvwxyz abcdef 11 12 13'''
to
'''1 2 3 4 5 6
abcde fghij klmno pqrst
7 8 9 10
uvwxyz abcdef
11 12 13'''
This is my current method:
s = re.sub(r'(\d) ([a-z])', r'\1\n\2', s)
s = re.sub(r'([a-z]) (\d)', r'\1\n\2', s)
How can I do this with a single regular expression? I know I can do it with re.findall and groups, but I am looking for a simpler way.
I really think the easiest way would be to match using findall instead of splitting or sub-ing:
result = re.findall(r"\d+(?:\s+\d+)*|[a-z]+(?:\s+[a-z]+)*", text)
print('\n'.join(result))
or in one line:
result = '\n'.join(re.findall(r"\d+(?:\s+\d+)*|[a-z]+(?:\s+[a-z]+)*", text))
Gives:
1 2 3 4 5 6
abcde fghij klmno pqrst
7 8 9 10
uvwxyz abcdef
11 12 13
\d+(?:\s+\d+)* matches the parts with digits and spaces.
[a-z]+(?:\s+[a-z]+)* matches the parts with letters and spaces.
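For anyone who wants to run the findall approach end to end, here is a minimal sketch. The pattern is the one above; the names RUN_PATTERN and split_runs are just made up for illustration.

```python
import re

# Hypothetical names; only the pattern itself comes from the answer above.
# There are no capturing groups, so findall returns the full matched runs.
RUN_PATTERN = re.compile(r"\d+(?:\s+\d+)*|[a-z]+(?:\s+[a-z]+)*")


def split_runs(text):
    """Put each run of numbers or of lowercase words on its own line."""
    return "\n".join(RUN_PATTERN.findall(text))


text = "1 2 3 4 5 6 abcde fghij klmno pqrst 7 8 9 10 uvwxyz abcdef 11 12 13"
print(split_runs(text))
```

One thing to keep in mind: findall only returns what the pattern matches, so anything that is neither a digit nor a lowercase letter (punctuation, uppercase letters) is silently dropped from the result. If the input can contain such characters, a substitution-based approach is probably safer.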
Here are two ways to do it with a single regex:
Use a conditional pattern. Capture \1 is straightforward. Capture \4 checks whether we grabbed \2 or \3, and then defines the rest of the pattern accordingly.
re.sub(r'((\d)|([a-z])) ((?(2)[a-z]|\d))', r'\1\n\4', s)
Replace only the space, and surround it with look-behind and look-ahead assertions.
re.sub(r'(?<=\d) (?=[a-z])|(?<=[a-z]) (?=\d)', '\n', s)
But your two simple regexes are better than all of this nonsense.
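As a rough comparison of the two single-regex versions, the sketch below runs both on the sample string and on a short input with one-character runs. The function names are hypothetical; the patterns are copied from this answer.

```python
import re

# Patterns copied from the answer; the names are made up for illustration.
LOOKAROUND = re.compile(r"(?<=\d) (?=[a-z])|(?<=[a-z]) (?=\d)")
CONDITIONAL = re.compile(r"((\d)|([a-z])) ((?(2)[a-z]|\d))")


def with_lookaround(s):
    # Replaces only the space; the neighbouring characters are not consumed.
    return LOOKAROUND.sub("\n", s)


def with_conditional(s):
    # Consumes the character on each side of the space and writes them back.
    return CONDITIONAL.sub(r"\1\n\4", s)


s = "1 2 3 4 5 6 abcde fghij klmno pqrst 7 8 9 10 uvwxyz abcdef 11 12 13"
print(with_lookaround(s) == with_conditional(s))  # True

# One-character runs expose a difference between the two versions.
print(repr(with_lookaround("1 a 2")))   # '1\na\n2'
print(repr(with_conditional("1 a 2")))  # '1\na 2'
```

The conditional version consumes the character after the space, so that character cannot also start the next match. On input like "1 a 2" it misses the second boundary, while the look-around version (and the original two-pass approach) handles it.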