
Pythonic list comprehension possible with this loop?

I have a love/hate relationship with list comprehensions. On the one hand, I think they are neat and elegant. On the other hand, I hate reading them (especially ones I didn't write). I generally follow the rule "make it readable until speed is required", so my question is really academic at this point.

I want a list of stations from a table whose strings often have extra spaces. I need those spaces stripped out. Sometimes a station is blank and should not be included.

stations = []
for row in data:
    if row.strip():
        stations.append(row.strip())

Which translates to this list comprehension:

stations = [row.strip() for row in data if row.strip()]

This works well enough, but it occurs to me that I'm calling strip twice. My guess was that the second .strip() call is unnecessary and is generally slower than assigning the result to a variable.

stations = []
for row in data:
    blah = row.strip()
    if blah:
        stations.append(blah)

Turns out I was correct.

> Striptwice list comp 14.5714301669     
> Striptwice loop 17.9919670399
> Striponce loop 13.0950567955

timeit shows that, of the two loop versions, the second (strip once) is faster. No real surprise there. What does surprise me is that the list comprehension is only marginally slower than the strip-once loop even though it calls strip twice.
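For reference, a comparison like this could be set up roughly as follows. This is only a sketch: the original table and timing harness are not shown, so the `data` list, the function names, and the `number=100` repeat count are all made up for illustration, and the absolute numbers will differ from the ones above.

```python
import timeit

# Hypothetical sample data; the real table is not shown in the question.
data = ["  Alpha ", "", "Bravo", "   ", " Charlie  "] * 200


def strip_twice_comp():
    # List comprehension that calls strip() once to test and once to keep.
    return [row.strip() for row in data if row.strip()]


def strip_twice_loop():
    # Explicit loop that also calls strip() twice per kept row.
    stations = []
    for row in data:
        if row.strip():
            stations.append(row.strip())
    return stations


def strip_once_loop():
    # Explicit loop that stores the stripped value and reuses it.
    stations = []
    for row in data:
        stripped = row.strip()
        if stripped:
            stations.append(stripped)
    return stations


if __name__ == "__main__":
    for func in (strip_twice_comp, strip_twice_loop, strip_once_loop):
        # number=100 is an arbitrary choice; timeit reports total seconds.
        seconds = timeit.timeit(func, number=100)
        print(f"{func.__name__}: {seconds:.4f}")
```

All three functions return the same list, so the only thing being compared is how the work is done, not what is produced.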

My question: Is there a way to write a list comprehension that only does the strip once?



Results:

Here are the timing results for the suggestions:

# @JonClements & @ErikAllik
> Striponce list comp 10.7998494348
# @adhie
> Mapmethod loop 14.4501044569
Marcel Wilson asked Oct 04 '13 15:10



2 Answers

There is: create a generator of the stripped strings first, then filter that:

stations = [row for row in (row.strip() for row in data) if row]

You could also write it without a comprehension, e.g. (for Python 2.x, swap map for itertools.imap and drop the list() call):

stations = list(filter(None, map(str.strip, data)))
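As a quick sanity check, both forms can be run side by side on a small list. The sample rows below are made up for illustration; note that `str.strip` used this way assumes every row is a `str`.

```python
# Hypothetical sample rows, including blank and whitespace-only entries.
data = ["  Alpha ", "", "Bravo", "   ", " Charlie  "]

# Generator expression feeding a list comprehension: strip once, then filter.
via_comprehension = [row for row in (row.strip() for row in data) if row]

# map() strips each row; filter(None, ...) drops falsy values such as "".
via_filter_map = list(filter(None, map(str.strip, data)))

print(via_comprehension)  # ['Alpha', 'Bravo', 'Charlie']
print(via_filter_map)     # ['Alpha', 'Bravo', 'Charlie']
```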
Jon Clements answered Sep 29 '22 05:09



Nested comprehensions can be tricky to read, so my first preference would be:

stripped = (x.strip() for x in data)
stations = [x for x in stripped if x]

Or, if you inline stripped, you get a single (nested) list comprehension:

stations = [x for x in (x.strip() for x in data) if x]

Note that the first/inner comprehension is actually a generator expression, in other words a lazy list comprehension. It produces each stripped string on demand, which avoids building an intermediate list and iterating over it a second time.
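One way to see the laziness is to count the strip calls. The sketch below uses a made-up helper, `counting_strip`, and made-up sample rows; neither is part of the question.

```python
# Records every value passed to the helper so the calls can be counted.
seen = []


def counting_strip(text):
    """Illustrative helper: strip text and remember that it was called."""
    seen.append(text)
    return text.strip()


# Hypothetical sample rows, including blank and whitespace-only entries.
data = ["  Alpha ", "", "Bravo", "   ", " Charlie  "]

# Creating the generator expression does not strip anything yet.
stripped = (counting_strip(x) for x in data)
calls_before = len(seen)

# Consuming it in the list comprehension strips each row exactly once.
stations = [x for x in stripped if x]
calls_after = len(seen)

print(calls_before)  # 0
print(calls_after)   # 5
print(stations)      # ['Alpha', 'Bravo', 'Charlie']
```

The count goes from 0 to 5 for five input rows, so each row is stripped once, and no intermediate list of stripped strings is ever built.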

Erik Kaplun answered Sep 29 '22 06:09
