Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python set comprehension for 2.6

I was trying set comprehension for 2.6, and came across the following two ways. I thought the first method would be faster than the second, timeit suggested otherwise. Why is the second method faster even though the second method has got an extra list instantiation followed by a set instantiation?

Method 1:

In [16]: %timeit set(node[0] for node in pwnodes if node[1].get('pm'))
1000000 loops, best of 3: 568 ns per loop

Method 2:

In [17]: %timeit set([node[0] for node in pwnodes if node[1].get('pm')]) 
1000000 loops, best of 3: 469 ns per loop

where pwnodes = [('e1', dict(pm=1, wired=1)), ('e2', dict(pm=1, wired=1))].

like image 479
Anandan Avatar asked Jul 07 '15 14:07

Anandan


2 Answers

Iteration is simply faster when using a list comprehension:

In [23]: from collections import deque

In [24]: %timeit deque((node[0] for node in pwnodes if node[1].get('pm')), maxlen=0)
1000 loops, best of 3: 305 µs per loop

In [25]: %timeit deque([node[0] for node in pwnodes if node[1].get('pm')], maxlen=0)
1000 loops, best of 3: 246 µs per loop

The deque is used to illustrate iteration speed; a deque with maxlen set to 0 discards all elements taken from the iterable so there are no memory allocation differences to skew the results.

That's because in Python 2, list comprehensions don't use a separate namespace, while a generator expression does (it has to, by necessity). That extra namespace requires a new frame on the stack, and this is expensive. The major advantage of generator expressions is their low memory footprint, not their speed.

In Python 3, list comprehensions have a separate namespace as well, and list comprehension and generator iteration speed is comparable. You also have set comprehensions, which are fastest still, even on Python 2.

like image 56
Martijn Pieters Avatar answered Oct 05 '22 22:10

Martijn Pieters


My guess is because the second one involves a generator and the first one doesn't. Generators are generally slower than the equivalent list if the equivalent list fits in memory.

In [4]: timeit for i in [i for i in range(1000)]: pass
10000 loops, best of 3: 47.2 µs per loop

In [5]: timeit for i in (i for i in range(1000)): pass
10000 loops, best of 3: 57.8 µs per loop
like image 26
NightShadeQueen Avatar answered Oct 05 '22 22:10

NightShadeQueen