I want to split a string containing an irregularly repeating delimiter, the way the split() method does:
>>> ' a b c de '.split()
['a', 'b', 'c', 'de']
However, when I split with a regular expression, the result is different (empty strings sneak into the resulting list):
>>> re.split(r'\s+', ' a b c de ')
['', 'a', 'b', 'c', 'de', '']
>>> re.split(r'\.+', '.a.b...c..de..')
['', 'a', 'b', 'c', 'de', '']
This is what I want to see:
>>> some_smart_split_method('.a.b...c..de..')
['a', 'b', 'c', 'de']
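The empty strings are an inevitable result of the regex split: when the string starts or ends with a delimiter match, re.split returns an empty string for the text before or after it (and there is good reasoning for why that behavior can be desirable). To get rid of them, call filter on the result: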
results = re.split(...)
results = list(filter(None, results))
Note that the list() call is only necessary in Python 3 -- in Python 2 filter() returns a list, while in Python 3 it returns a lazy filter object.
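The two steps above can be wrapped in a small helper. This is only a minimal sketch: the name split_nonempty is hypothetical (it is not part of the re module), and it assumes you never want to keep empty fields.

```python
import re

def split_nonempty(pattern, text):
    """Split text on a regex pattern and drop the empty strings."""
    # filter(None, ...) removes every falsy item, which here means ''.
    return list(filter(None, re.split(pattern, text)))

print(split_nonempty(r'\.+', '.a.b...c..de..'))  # ['a', 'b', 'c', 'de']
print(split_nonempty(r'\s+', ' a b c de '))      # ['a', 'b', 'c', 'de']
```

If empty fields in the middle of the string are meaningful for your data, filtering like this would hide them, so it may be worth checking that first.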
Alternatively, you can match the pieces you want instead of the delimiters, using re.findall:
>>> re.findall(r'\S+', ' a b c de ')
['a', 'b', 'c', 'de']
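The same idea should carry over to the dot-delimited string from the question. A sketch, assuming the delimiter is a single literal character so that a negated character class can describe "everything that is not a delimiter":

```python
import re

text = '.a.b...c..de..'
# [^.]+ matches each maximal run of characters that are not a dot,
# so leading, trailing, and repeated dots never produce empty strings.
tokens = re.findall(r'[^.]+', text)
print(tokens)  # ['a', 'b', 'c', 'de']
```

Inside a character class the dot is literal, so it does not need to be escaped here.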