Split string without losing delimiter (and its count)

I'm trying to split a string like the one below on spaces, keeping each run of spaces as its own element:

string = "This            is a                      test."

# desired output
# ['This', '            ', 'is', ' ', 'a', '                      ', 'test.']

# actual output, which does make sense
result = string.split()
# ['This', 'is', 'a', 'test.']

There's also re.split, which keeps the delimiter when the pattern contains a capturing group, but not in the way I hoped:

import re
string = "This            is a                      test."

result = re.split(r"( )", string)
# ['This',
# ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ',
# 'is', ' ',
# 'a', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ',
# 'test.']
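
A smaller case may make the empty strings easier to see. This is only a sketch with a made-up input (`sample` is not from the string above): with the pattern `( )`, every single space is a separate delimiter, so two adjacent spaces have a zero-length field between them.

```python
import re

sample = "a  b"  # hypothetical input: two spaces between the words

# Each space is matched on its own, so the empty string that sits
# between the two adjacent spaces is returned as a field of its own.
parts = re.split(r"( )", sample)
print(parts)  # ['a', ' ', '', ' ', 'b']
```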

I can do something like this and achieve the result I want:

string = "This            is a                      test."
result = []

spaces = ''
word = ''
for letter in string:
    if letter == ' ':
        spaces += ' ' 
        if word:
            result.append(word)
            word = ''
    else:
        word += letter
        if spaces:
            result.append(spaces)
            spaces = ''
if spaces:
    result.append(spaces)
if word:
    result.append(word)

print(result)
# ['This', '            ', 'is', ' ', 'a', '                      ', 'test.']

But this doesn't feel like the best way to do it. Is there a more Pythonic way of achieving this?

Amir Shabani asked Dec 31 '22 13:12


1 Answer

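Use re.split with the pattern (\s+). The + makes each run of whitespace match as a single delimiter, and the capturing group keeps that delimiter in the result: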

>>> import re
>>> string = "This            is a                      test."
>>> re.split(r'(\s+)', string)
['This', '            ', 'is', ' ', 'a', '                      ', 'test.']

Regex101 example.
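
Two things may be worth checking before relying on this: `\s` also matches tabs and newlines, and re.split returns an empty string at either end when the input starts or ends with a delimiter. The sketch below is one possible way to handle both, assuming only the space character should count as a delimiter. The helper names `split_keep_spaces` and `split_with_groupby` are made up for illustration, and the second one is a regex-free alternative rather than part of the answer above.

```python
import re
from itertools import groupby


def split_keep_spaces(text):
    """Return alternating words and runs of spaces (hypothetical helper)."""
    # ( +) matches runs of the space character only; (\s+) would also
    # treat tabs and newlines as delimiters.
    # The filter drops the '' that re.split produces at the edges when
    # the text starts or ends with spaces.
    return [part for part in re.split(r"( +)", text) if part]


def split_with_groupby(text):
    """Same idea without a regex: group consecutive characters by 'is a space'."""
    return ["".join(group) for _, group in groupby(text, key=lambda ch: ch == " ")]


string = "This            is a                      test."
print(split_keep_spaces(string))
print(split_with_groupby(string))
```

Both helpers keep every character, so joining the pieces with `"".join(...)` gives back the original string.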

U12-Forward answered Jan 12 '23 19:01