A coworker recently wrote a program in which he used a Python list as a queue. In other words, he used .append(x)
when needing to insert items and .pop(0)
when needing to remove items.
I know that Python has collections.deque
and I'm trying to figure out whether to spend my (limited) time to rewrite this code to use it. Assuming that we perform millions of appends and pops but never have more than a few thousand entries, will his list usage be a problem?
In particular, will the underlying array used by the Python list implementation continue to grow indefinitely have millions of spots even though the list only has a thousand things, or will Python eventually do a realloc
and free up some of that memory?
Python3 Queue is much much slower than list - Codeforces.
List is a Python's built-in data structure that can be used as a queue. Instead of enqueue() and dequeue(), append() and pop() function is used.
So swapping a queue, even at a relatively small size of 100 items, is 5x faster. And the larger the queue, the more performance we can squeeze out of not using a list.
deque is an excellent default choice for implementing a FIFO queue data structure in Python.
Some answers claimed a "10x" speed advantage for deque vs list-used-as-FIFO when both have 1000 entries, but that's a bit of an overbid:
$ python -mtimeit -s'q=range(1000)' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 1.24 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(1000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.573 usec per loop
python -mtimeit
is your friend -- a really useful and simple micro-benchmarking approach! With it you can of course also trivially explore performance in much-smaller cases:
$ python -mtimeit -s'q=range(100)' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 0.972 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(100))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.576 usec per loop
(not very different for 12 instead of 100 items btw), and in much-larger ones:
$ python -mtimeit -s'q=range(10000)' 'q.append(23); q.pop(0)'
100000 loops, best of 3: 5.81 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(10000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.574 usec per loop
You can see that the claim of O(1) performance for deque is well founded, while a list is over twice as slow around 1,000 items, an order of magnitude around 10,000. You can also see that even in such cases you're only wasting 5 microseconds or so per append/pop pair and decide how significant that wastage is (though if that's all you're doing with that container, deque has no downside, so you might as well switch even if 5 usec more or less won't make an important difference).
You won't run out of memory using the list
implementation, but performance will be poor. From the docs:
Though
list
objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs forpop(0)
andinsert(0, v)
operations which change both the size and position of the underlying data representation.
So using a deque
will be much faster.
From Beazley's Python Essential Reference, Fourth Edition, p. 194:
Some library modules provide new types that outperform the built-ins at certain tasks. For instance, collections.deque type provides similar functionality to a list but has been highly optimized for the insertion of items at both ends. A list, in contrast, is only efficient when appending items at the end. If you insert items at the front, all of the other elements need to be shifted in order to make room. The time required to do this grows as the list gets larger and larger. Just to give you an idea of the difference, here is a timing measurement of inserting one million items at the front of a list and a deque:
And there follows this code sample:
>>> from timeit import timeit
>>> timeit('s.appendleft(37)', 'import collections; s = collections.deque()', number=1000000)
0.13162776274638258
>>> timeit('s.insert(0,37)', 's = []', number=1000000)
932.07849908298408
Timings are from my machine.
2012-07-01 Update
>>> from timeit import timeit
>>> n = 1024 * 1024
>>> while n > 1:
... print '-' * 30, n
... timeit('s.appendleft(37)', 'import collections; s = collections.deque()', number=n)
... timeit('s.insert(0,37)', 's = []', number=n)
... n >>= 1
...
------------------------------ 1048576
0.1239769458770752
799.2552740573883
------------------------------ 524288
0.06924104690551758
148.9747350215912
------------------------------ 262144
0.029170989990234375
35.077512979507446
------------------------------ 131072
0.013737916946411133
9.134140014648438
------------------------------ 65536
0.006711006164550781
1.8818109035491943
------------------------------ 32768
0.00327301025390625
0.48307204246520996
------------------------------ 16384
0.0016388893127441406
0.11021995544433594
------------------------------ 8192
0.0008249282836914062
0.028419017791748047
------------------------------ 4096
0.00044918060302734375
0.00740504264831543
------------------------------ 2048
0.00021195411682128906
0.0021741390228271484
------------------------------ 1024
0.00011205673217773438
0.0006101131439208984
------------------------------ 512
6.198883056640625e-05
0.00021386146545410156
------------------------------ 256
2.9087066650390625e-05
8.797645568847656e-05
------------------------------ 128
1.5974044799804688e-05
3.600120544433594e-05
------------------------------ 64
8.821487426757812e-06
1.9073486328125e-05
------------------------------ 32
5.0067901611328125e-06
1.0013580322265625e-05
------------------------------ 16
3.0994415283203125e-06
5.9604644775390625e-06
------------------------------ 8
3.0994415283203125e-06
5.0067901611328125e-06
------------------------------ 4
3.0994415283203125e-06
4.0531158447265625e-06
------------------------------ 2
2.1457672119140625e-06
2.86102294921875e-06
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With