Split by regex without resulting empty strings in Python [duplicate]

Question

I want to split a string containing an irregularly repeating delimiter, the way the split() method does:

>>> ' a b   c  de  '.split()
['a', 'b', 'c', 'de']

However, when I apply split by regular expression, the result is different (empty strings sneak into the resulting list):

>>> import re
>>> re.split(r'\s+', ' a b   c  de  ')
['', 'a', 'b', 'c', 'de', '']
>>> re.split(r'\.+', '.a.b...c..de..')
['', 'a', 'b', 'c', 'de', '']

And what I want to see:

>>> some_smart_split_method('.a.b...c..de..')
['a', 'b', 'c', 'de']

Walker · Accepted Answer

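The empty strings are an inevitable result of the regex split: whenever the pattern matches at the very start or end of the string, re.split() returns an empty string for the text on the outer side of that match (though there is good reasoning as to why that behavior might be desirable). To get rid of them, you can call filter() on the result.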

results = re.split(...)
results = list(filter(None, results))

Note that the list() call is only necessary in Python 3: in Python 2, filter() returns a list, while in Python 3 it returns a filter object (a lazy iterator).
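As a minimal sketch of the approach above, the split and the filtering step can be wrapped in one helper. The name split_nonempty is hypothetical (it is not part of the re module), and a list comprehension is used in place of filter() so the result is a list on both Python 2 and Python 3.

```python
import re

def split_nonempty(pattern, text):
    """Split text on pattern and drop the empty strings re.split() produces.

    split_nonempty is an illustrative name, not a standard-library function.
    """
    # Empty strings are falsy, so "if piece" keeps only non-empty pieces.
    return [piece for piece in re.split(pattern, text) if piece]

print(split_nonempty(r'\.+', '.a.b...c..de..'))  # ['a', 'b', 'c', 'de']
print(split_nonempty(r'\s+', ' a b   c  de  '))  # ['a', 'b', 'c', 'de']
```

This also drops empty strings from the middle of the result, which can appear when the pattern matches a single delimiter and two delimiters are adjacent.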

dlask · Answer

>>> re.findall(r'\S+', ' a b   c  de  ')
['a', 'b', 'c', 'de']
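This works by matching the tokens themselves instead of the delimiters, so no empty strings are produced. The same idea can probably be applied to the dot-delimited example from the question with a negated character class. This is a sketch that assumes the delimiter is a single character that can go inside [^...].

```python
import re

# Match runs of characters that are NOT the delimiter,
# rather than splitting on the delimiter.
tokens = re.findall(r'[^.]+', '.a.b...c..de..')
print(tokens)  # ['a', 'b', 'c', 'de']
```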

Tags: python, regex

Roman
