I'm trying to split a string like the one below on spaces, keeping each run of spaces as its own item:
string = "This is a test."
# desired output
# ['This', ' ', 'is', ' ', 'a', ' ', 'test.']
# actual output, which does make sense
result = string.split()
# ['This', 'is', 'a', 'test.']
There's also re.split, which keeps the delimiter, but not in the way I hoped:
import re
string = "This is a test."
result = re.split(r"( )", string)
# ['This',
# ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ',
# 'is', ' ',
# 'a', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ', '', ' ',
# 'test.']
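To illustrate where the empty strings come from, here is a minimal sketch on a shorter sample. The sample string and its three-space gap are assumptions made for the illustration, not the original input:

```python
import re

# Hypothetical sample: three consecutive spaces between two words.
sample = "a   b"

# The pattern matches one space at a time, and the capturing group keeps
# each match. Nothing sits between two adjacent spaces, so re.split
# reports an empty string between every pair of neighbouring delimiters.
parts = re.split(r"( )", sample)
print(parts)  # ['a', ' ', '', ' ', '', ' ', 'b']
```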
I can do something like this and achieve the result I want:
string = "This is a test."
result = []
spaces = ''
word = ''
for letter in string:
    if letter == ' ':
        spaces += ' '
        if word:
            result.append(word)
            word = ''
    else:
        word += letter
        if spaces:
            result.append(spaces)
            spaces = ''
if spaces:
    result.append(spaces)
if word:
    result.append(word)
print(result)
# ['This', ' ', 'is', ' ', 'a', ' ', 'test.']
But this doesn't feel like the best way to do it. Is there a more Pythonic way of achieving this?
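Try re.split with the pattern (\s+). The + matches each run of whitespace as a single delimiter, and the capturing group keeps that delimiter in the result: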
>>> import re
>>> string = "This is a test."
>>> re.split(r'(\s+)', string)
['This', ' ', 'is', ' ', 'a', ' ', 'test.']
Regex101 example.
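As a rough sketch of how this behaves on runs of several spaces and at the edges of a string, the snippet below uses made-up sample strings (their spacing is an assumption). It also shows re.findall as one possible alternative, not something the answer above proposes:

```python
import re

# Hypothetical sample with runs of several spaces (spacing is an assumption).
sample = "This   is a    test."

# The capturing group keeps each delimiter; \s+ consumes a whole run at once.
tokens = re.split(r"(\s+)", sample)
print(tokens)  # ['This', '   ', 'is', ' ', 'a', '    ', 'test.']

# Nothing is lost: joining the pieces gives back the original string.
assert "".join(tokens) == sample

# Caveat: leading or trailing whitespace yields empty strings at the edges.
padded = re.split(r"(\s+)", " hi ")
print(padded)  # ['', ' ', 'hi', ' ', '']

# Possible alternative: collect runs of whitespace and non-whitespace directly,
# which avoids the empty strings at the edges.
runs = re.findall(r"\s+|\S+", " hi ")
print(runs)  # [' ', 'hi', ' ']
```

Note that \s also matches tabs and newlines, whereas the loop in the question only treats ' ' as a separator. If only literal spaces should count, the pattern ( +) is probably the closer match.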