Why do these two functions have the same bytecode when disassembled under dis.dis?

What follows are four functions that produce the same output. They differ in whether they use a list comprehension or a tight loop, and in whether the condition is a function call or an inline comparison.

Interestingly, a and b have the same bytecode when disassembled; however, b is much faster than a.

Moreover, d, which uses a tight loop with no function call, is faster than a, which uses a list comprehension with a function call.

Why do functions a and b have the same bytecode, and why does b perform much better than a given the same bytecode?

import dis

def my_filter(n):
    return n < 5

def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

def c():
    # tight loop with function call
    values = []
    for i in range(10):
        if my_filter(i):
            values.append(i)
    return values

def d():
    # tight loop without function call
    values = []
    for i in range(10):
        if i < 5:
            values.append(i)
    return values

assert a() == b() == c() == d()

>>> import sys
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)

# list comprehension with function call
>>> dis.dis(a)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBE8B300, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('a.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

# list comprehension without function call
>>> dis.dis(b)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBB64270, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('b.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

# a and b have the same byte code?
# Why doesn't a have a LOAD_GLOBAL (my_filter) and CALL_FUNCTION?
# c below has both of these    

# tight loop with function call
>>> dis.dis(c)
  2           0 BUILD_LIST               0
              2 STORE_FAST               0 (values)

  3           4 SETUP_LOOP              34 (to 40)
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               1 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
        >>   14 FOR_ITER                22 (to 38)
             16 STORE_FAST               1 (i)

  4          18 LOAD_GLOBAL              1 (my_filter)
             20 LOAD_FAST                1 (i)
             22 CALL_FUNCTION            1
             24 POP_JUMP_IF_FALSE       14

  5          26 LOAD_FAST                0 (values)
             28 LOAD_ATTR                2 (append)
             30 LOAD_FAST                1 (i)
             32 CALL_FUNCTION            1
             34 POP_TOP
             36 JUMP_ABSOLUTE           14
        >>   38 POP_BLOCK

  6     >>   40 LOAD_FAST                0 (values)
             42 RETURN_VALUE

# tight loop without function call
>>> dis.dis(d)
  2           0 BUILD_LIST               0
              2 STORE_FAST               0 (values)

  3           4 SETUP_LOOP              34 (to 40)
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               1 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
        >>   14 FOR_ITER                22 (to 38)
             16 STORE_FAST               1 (i)

  4          18 LOAD_FAST                1 (i)
             20 LOAD_CONST               2 (5)
             22 COMPARE_OP               0 (<)
             24 POP_JUMP_IF_FALSE       14

  5          26 LOAD_FAST                0 (values)
             28 LOAD_ATTR                1 (append)
             30 LOAD_FAST                1 (i)
             32 CALL_FUNCTION            1
             34 POP_TOP
             36 JUMP_ABSOLUTE           14
        >>   38 POP_BLOCK

  6     >>   40 LOAD_FAST                0 (values)
             42 RETURN_VALUE

>>> import timeit
>>> timeit.timeit(a)  # list comprehension with my_filter
1.2435139456834463
>>> timeit.timeit(b)  # list comprehension without my_filter
0.6717423789164627
>>> timeit.timeit(c)  # no list comprehension with my_filter
1.326850592144865
>>> timeit.timeit(d)  # no list comprehension no my_filter
0.7743895521070954

Why do a and b have the same bytecode when disassembled? I would have expected b to have better-looking bytecode. Notably, I would have thought that a would need a LOAD_GLOBAL (my_filter) and a CALL_FUNCTION. For example, c is the same as a but without the list comprehension, and it uses these instructions at offsets 18 and 22.

However, even with the same bytecode, b performs much better than a. What's going on here?

Even more interesting, d, which uses a tight loop but does not call my_filter, is faster than a, which uses a list comprehension but does call my_filter. It looks like the overhead of the function call outweighs the overhead of the tight loop.

My goal here is to try to figure out if I can factor out conditions of a list comprehension into a function to make the list comprehension easier to read.
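
One way to weigh that readability trade-off on a particular interpreter is to time both spellings directly. The following is only a minimal sketch: with_call, inline and time_variants are hypothetical names introduced for illustration, and the absolute numbers will vary by machine and Python version.

```python
import timeit


def my_filter(n):
    return n < 5


def with_call():
    # Condition factored out into a function (one extra call per element).
    return [i for i in range(10) if my_filter(i)]


def inline():
    # Same condition written inline in the comprehension.
    return [i for i in range(10) if i < 5]


def time_variants(number=100_000):
    # Return the total seconds each variant needs for `number` calls.
    return {
        "with_call": timeit.timeit(with_call, number=number),
        "inline": timeit.timeit(inline, number=number),
    }
```

Calling time_variants() and printing the two values gives a rough measure of the per-element call overhead for the condition in question.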

asked Aug 25 '19 16:08

Matthew Moisen

2 Answers

Note that the bytecode of both a and b only loads and calls a <listcomp> code object that is defined elsewhere.

2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBE8B300, file "<stdin>", line 2>)

Since the wrapper functions a and b are identical, their bytecode is the same; only the addresses of the listcomp code objects differ.

In Python 3.7 and later, dis.dis also disassembles the nested listcomp code objects. Here is the complete code and the output:
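
The difference can also be checked without reading disassembly, by looking at which names each function refers to once nested code objects are included. This is a sketch, not part of the dis module: iter_code_objects and referenced_names are hypothetical helpers, and they rely only on the co_consts and co_names attributes of code objects.

```python
import types


def my_filter(n):
    return n < 5


def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]


def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]


def iter_code_objects(code):
    # Yield this code object, then any code objects stored in its constants.
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def referenced_names(func):
    # co_names holds the global and attribute names a code object refers to.
    names = set()
    for code in iter_code_objects(func.__code__):
        names.update(code.co_names)
    return names


print(sorted(referenced_names(a)))
print(sorted(referenced_names(b)))
```

Only a should list my_filter, which is the extra global lookup and call that the wrapper-level disassembly hides.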

import sys
import dis

def my_filter(n):
    return n < 5

def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

print(sys.version)
print('-' * 70)
dis.dis(a)
print('-' * 70)
dis.dis(b)

3.7.3 (default, May 19 2019, 21:16:26) 
[Clang 10.0.1 (clang-1001.0.46.4)]
----------------------------------------------------------------------
  9           0 LOAD_CONST               1 (<code object <listcomp> at 0x1065c61e0, file "/w/test/x.py", line 9>)
              2 LOAD_CONST               2 ('a.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x1065c61e0, file "/w/test/x.py", line 9>:
  9           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_GLOBAL              0 (my_filter)
             10 LOAD_FAST                1 (i)
             12 CALL_FUNCTION            1
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
----------------------------------------------------------------------
 13           0 LOAD_CONST               1 (<code object <listcomp> at 0x1066188a0, file "/w/test/x.py", line 13>)
              2 LOAD_CONST               2 ('b.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x1066188a0, file "/w/test/x.py", line 13>:
 13           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LOAD_CONST               0 (5)
             12 COMPARE_OP               0 (<)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE

For Python versions before 3.7, see "Python: analyze a list comprehension with dis".

answered Nov 03 '22 02:11

georg

List comprehensions are compiled to inner functions because they build a separate namespace. The inner functions for the list comprehensions in a and b differ:

>>> dis.dis(a.__code__.co_consts[1])
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_GLOBAL              0 (my_filter)
             10 LOAD_FAST                1 (i)
             12 CALL_FUNCTION            1
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
>>> dis.dis(b.__code__.co_consts[1])
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LOAD_CONST               0 (5)
             12 COMPARE_OP               0 (<)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE

There you see the function call in a and the comparison in b.
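
The index 1 in co_consts[1] depends on how the compiler lays out the constants, so a slightly more robust sketch selects nested code objects by type instead. disassemble_comprehension is a hypothetical helper name. It falls back to the function's own code when no nested code object exists, which is the case from Python 3.12 on, where comprehensions are inlined into the enclosing function.

```python
import dis
import types


def my_filter(n):
    return n < 5


def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]


def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]


def disassemble_comprehension(func):
    # Pick nested code objects by type rather than by a hard-coded index.
    code = func.__code__
    nested = [c for c in code.co_consts if isinstance(c, types.CodeType)]
    # Fall back to the function itself if the comprehension was inlined.
    targets = nested or [code]
    return "\n".join(dis.Bytecode(c).dis() for c in targets)


print(disassemble_comprehension(a))
print(disassemble_comprehension(b))
```

The exact opcodes printed differ between Python versions, but the text for a should mention my_filter while the text for b should not.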

answered Nov 03 '22 04:11

Daniel


Tags:

python

python-3.x

bytecode

python-3.6
