This is interesting: calling list() on an iterator to get the actual list is so much faster than the equivalent list comprehension ([x for x in someList]).
Is this for real, or is my test just too simple? Here is the code:
import time

timer = time.clock()
for i in xrange(90):
    #localList = [x for x in xrange(1000000)]  # Very slow, took me 6.8s
    localList = list(xrange(1000000))  # Very fast, took me 0.9s
    print localList[999999]  # make sure the list is really evaluated
print "Total time: ", time.clock() - timer
The list comprehension executes the loop in Python bytecode, just like a regular for loop.
The list() call iterates entirely in C code, which is far faster.
The bytecode for the list comprehension looks like this:
>>> import dis
>>> dis.dis(compile("[x for x in xrange(1000000)]", '<stdin>', 'exec'))
  1           0 BUILD_LIST               0
              3 LOAD_NAME                0 (xrange)
              6 LOAD_CONST               0 (1000000)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                12 (to 28)
             16 STORE_NAME               1 (x)
             19 LOAD_NAME                1 (x)
             22 LIST_APPEND              2
             25 JUMP_ABSOLUTE           13
        >>   28 POP_TOP
             29 LOAD_CONST               1 (None)
             32 RETURN_VALUE
The >> pointers roughly give you the boundaries of the loop being executed, so you have 1 million STORE_NAME, LOAD_NAME and LIST_APPEND steps to execute in the Python bytecode evaluation loop.
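For comparison, disassembling the list() version shows no Python-level loop at all; the whole iteration is hidden inside a single CALL_FUNCTION (output below is from CPython 2.7, exact offsets may differ on other versions):

>>> dis.dis(compile("list(xrange(1000000))", '<stdin>', 'exec'))
  1           0 LOAD_NAME                0 (list)
              3 LOAD_NAME                1 (xrange)
              6 LOAD_CONST               0 (1000000)
              9 CALL_FUNCTION            1
             12 CALL_FUNCTION            1
             15 POP_TOP
             16 LOAD_CONST               1 (None)
             19 RETURN_VALUE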
list() on the other hand just grabs the values from the xrange() iterable directly by using the C API for object iteration, and it can use the length of the xrange() object to pre-allocate the list object rather than grow it dynamically.
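A rough way to see both effects separately (a sketch, assuming CPython 2.7; exact numbers will vary): xrange() exposes its length, so list() can size the result up front, while a generator expression over the same values hides the length and forces each item back through Python-level code.

import timeit

print len(xrange(1000000))  # 1000000 -- the length list() can use to presize the result

# all of the iteration stays in C, with a presized result list
print timeit.timeit('list(xrange(1000000))', number=50)

# the generator hides the length and each item resumes a Python frame,
# so this is expected to lose most of the advantage
print timeit.timeit('list(x for x in xrange(1000000))', number=50)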