Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python regular expression replace two situations with one command

I want to replace string like

'''1  2  3  4  5  6 abcde fghij klmno pqrst 7 8 9 10 uvwxyz abcdef 11 12 13'''

to

'''1  2  3  4  5  6
abcde fghij klmno pqrst
7 8 9 10
uvwxyz abcdef
11 12 13'''

that is my method:

s = re.sub(r'(\d) ([a-z])', r'\1\n\2', s)
s = re.sub(r'([a-z]) (\d)', r'\1\n\2', s)

how can I do this in one regular expression? and I know I can do it use re.findall and groups but I want to find a more easy way?

like image 897
WeizhongTu Avatar asked Jun 17 '26 14:06

WeizhongTu


2 Answers

I really think the easiest way would be to match using findall instead of splitting or sub-ing:

result = re.findall(r"\d+(?:\s+\d+)*|[a-z]+(?:\s+[a-z]+)*", text)
print('\n'.join(result))

or in one line:

result = '\n'.join(re.findall(r"\d+(?:\s+\d+)*|[a-z]+(?:\s+[a-z]+)*", text))

Gives:

1  2  3  4  5  6
abcde fghij klmno pqrst
7 8 9 10
uvwxyz abcdef
11 12 13

\d+(?:\s+\d+)* matches the parts with digits and spaces.

[a-z]+(?:\s+[a-z]+)* matches the parts with letters and spaces.

like image 78
Jerry Avatar answered Jun 20 '26 04:06

Jerry


Here are two ways to do it with a single regex:

  • Use a conditional pattern. Capture \1 is straightforward. Capture \4 checks whether we grabbed \2 or \3, and then defines the rest of the pattern accordingly.

    re.sub(r'((\d)|([a-z])) ((?(2)[a-z]|\d))', r'\1\n\4', s)
    
  • Replace only the space, and surround it with look-behind and look-ahead assertions.

    re.sub(r'(?<=\d) (?=[a-z])|(?<=[a-z]) (?=\d)', '\n', s)
    

But your two simple regexes are better than all of this nonsense.

like image 27
FMc Avatar answered Jun 20 '26 05:06

FMc



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!