
Python split without creating blanks [duplicate]

I understand why it is important that split produces empty strings, thanks to this question, but sometimes I do not want them in the result.

Let's say you parsed some CSS and got the following strings:

s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
s2 = 'color:#000;background-color:#fff;border:1px #333 dotted'

Both are valid CSS, even though s2 lacks the semicolon at the end of the string. Splitting the strings gives the following:

>>> s1.split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted', '']
>>> s2.split(';')
['color:#000', 'background-color:#fff', 'border:1px #333 dotted']

The trailing semicolon in s1 creates an empty item in the list. If I want to manipulate the result further, I need to test the beginning and end of each list and remove any empty items there, which is not that bad but seems avoidable.

Question:

Is there a method that is essentially the same as split but does not include trailing blank items? Or is there simply a way to remove them, just as a string has strip to remove surrounding whitespace?

Ryan Saxe asked Oct 14 '25 08:10



2 Answers

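Simply remove the empty items by passing None as the filter function:

list(filter(None, s1.split(';')))

Demo:

>>> s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
>>> list(filter(None, s1.split(';')))
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']

Calling filter() with None as the function removes every item that evaluates to false in a boolean context, such as empty strings, None, and numeric 0. In Python 2, filter() returns a list directly; in Python 3 it returns a lazy iterator, so the list() call is needed to get a list.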
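
As a minimal Python 3 sketch, the same idea can be wrapped in a small helper. The name split_nonempty is hypothetical, not part of the standard library. Since str.split() only ever yields strings, the only falsy items it can drop here are empty strings:

```python
def split_nonempty(text, sep=';'):
    """Split text on sep and drop the empty strings that split() produces."""
    # filter(None, ...) keeps only truthy items; list() materializes the
    # lazy iterator that filter() returns in Python 3.
    return list(filter(None, text.split(sep)))


s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
s2 = 'color:#000;background-color:#fff;border:1px #333 dotted'

print(split_nonempty(s1))
print(split_nonempty(s2))
```

With this helper both strings produce three items, whether or not the trailing semicolon is present.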

In the timings below, measured on Python 2 where filter() builds a list, filter(None, ...) is considerably faster than the equivalent list comprehension:

>>> import timeit
>>> timeit.timeit('filter(None, a)', "a = [1, 2, 3, None, 4, 'five', ''] * 100")
9.410392045974731
>>> timeit.timeit('[i for i in a if i]', "a = [1, 2, 3, None, 4, 'five', ''] * 100")
44.9318630695343
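
To repeat this comparison on Python 3, the filter() call has to be wrapped in list(), otherwise only the creation of a lazy iterator is timed. The following is a rough sketch; the reduced number of runs is an arbitrary choice to keep it quick, and the results will vary by machine and Python version, so no figures are claimed here:

```python
import timeit

setup = "a = [1, 2, 3, None, 4, 'five', ''] * 100"
number = 10000  # arbitrary; far fewer runs than timeit's default of 1,000,000

# list() forces filter() to actually walk the input on Python 3.
t_filter = timeit.timeit('list(filter(None, a))', setup=setup, number=number)
t_comp = timeit.timeit('[i for i in a if i]', setup=setup, number=number)

print('filter:        %.4f s' % t_filter)
print('comprehension: %.4f s' % t_comp)
```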
Martijn Pieters answered Oct 16 '25 22:10



You can use a list comprehension to filter out the empty strings, because an empty string is falsy:

>>> s1 = 'background-color:#000;color:#fff;border:1px #ccc dotted;'
>>> [i for i in s1.split(';') if i]
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']

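Alternatively, you can rstrip() the trailing semicolon before splitting. This only prevents trailing blanks; empty items caused by a leading or doubled semicolon remain: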

>>> s1.rstrip(';').split(';')
['background-color:#000', 'color:#fff', 'border:1px #ccc dotted']
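
The two approaches differ once semicolons appear somewhere other than the end. As a small illustration, using a made-up input string that is not from the question:

```python
# Hypothetical input with a leading, a doubled, and two trailing semicolons.
s = ';color:#000;;border:1px #333 dotted;;'

# rstrip(';') only removes trailing semicolons before the split.
trailing_only = s.rstrip(';').split(';')
print(trailing_only)  # ['', 'color:#000', '', 'border:1px #333 dotted']

# strip(';') handles both ends but not the doubled semicolon in the middle.
both_ends = s.strip(';').split(';')
print(both_ends)  # ['color:#000', '', 'border:1px #333 dotted']

# The comprehension drops every empty item regardless of position.
all_removed = [i for i in s.split(';') if i]
print(all_removed)  # ['color:#000', 'border:1px #333 dotted']
```

If only trailing blanks should go, as the question asks, rstrip() matches that most closely; the comprehension is the safer choice when blanks anywhere are unwanted.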
TerryA answered Oct 16 '25 22:10
