 

Why do these two functions have the same bytecode when disassembled under dis.dis?

What follows are four functions that produce the same output. They differ in whether they use a list comprehension or a tight loop, and in whether the condition is a function call or written inline.

Interestingly, a and b have the same bytecode when disassembled, yet b is much faster than a.

Moreover, d, which uses a tight loop with no function call, is faster than a, which uses a list comprehension with a function call.

Why do functions a and b have the same bytecode, and why does b perform much better than a given the same bytecode?

import dis

def my_filter(n):
    return n < 5

def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

def c():
    # tight loop with function call
    values = []
    for i in range(10):
        if my_filter(i):
            values.append(i)
    return values

def d():
    # tight loop without function call
    values = []
    for i in range(10):
        if i < 5:
            values.append(i)
    return values

assert a() == b() == c() == d()

>>> import sys
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)

# list comprehension with function call
>>> dis.dis(a)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBE8B300, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('a.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

# list comprehension without function call
>>> dis.dis(b)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBB64270, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('b.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

# a and b have the same byte code?
# Why doesn't a have a LOAD_GLOBAL (my_filter) and CALL_FUNCTION?
# c below has both of these    

# tight loop with function call
>>> dis.dis(c)
  2           0 BUILD_LIST               0
              2 STORE_FAST               0 (values)

  3           4 SETUP_LOOP              34 (to 40)
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               1 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
        >>   14 FOR_ITER                22 (to 38)
             16 STORE_FAST               1 (i)

  4          18 LOAD_GLOBAL              1 (my_filter)
             20 LOAD_FAST                1 (i)
             22 CALL_FUNCTION            1
             24 POP_JUMP_IF_FALSE       14

  5          26 LOAD_FAST                0 (values)
             28 LOAD_ATTR                2 (append)
             30 LOAD_FAST                1 (i)
             32 CALL_FUNCTION            1
             34 POP_TOP
             36 JUMP_ABSOLUTE           14
        >>   38 POP_BLOCK

  6     >>   40 LOAD_FAST                0 (values)
             42 RETURN_VALUE

# tight loop without function call
>>> dis.dis(d)
  2           0 BUILD_LIST               0
              2 STORE_FAST               0 (values)

  3           4 SETUP_LOOP              34 (to 40)
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               1 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
        >>   14 FOR_ITER                22 (to 38)
             16 STORE_FAST               1 (i)

  4          18 LOAD_FAST                1 (i)
             20 LOAD_CONST               2 (5)
             22 COMPARE_OP               0 (<)
             24 POP_JUMP_IF_FALSE       14

  5          26 LOAD_FAST                0 (values)
             28 LOAD_ATTR                1 (append)
             30 LOAD_FAST                1 (i)
             32 CALL_FUNCTION            1
             34 POP_TOP
             36 JUMP_ABSOLUTE           14
        >>   38 POP_BLOCK

  6     >>   40 LOAD_FAST                0 (values)
             42 RETURN_VALUE

>>> import timeit

>>> timeit.timeit(a)  # list comprehension with my_filter
1.2435139456834463
>>> timeit.timeit(b)  # list comprehension without my_filter
0.6717423789164627
>>> timeit.timeit(c)  # no list comprehension with my_filter
1.326850592144865
>>> timeit.timeit(d)  # no list comprehension no my_filter
0.7743895521070954

Why do a and b have the same bytecode when disassembled? I would have expected b to have better-looking bytecode. Notably, I would have thought that a would need a LOAD_GLOBAL (my_filter) and a CALL_FUNCTION. For example, c is the same as a but without the list comprehension, and it uses these two instructions at offsets 18 and 22.

However, even with the same bytecode, b performs much better than a. What's going on here?

Even more interesting, d, which uses a tight loop but does not call my_filter, is faster than a, which uses a list comprehension but does call my_filter. It looks like the overhead of a function call outweighs the overhead of a tight loop.

My goal here is to try to figure out if I can factor out conditions of a list comprehension into a function to make the list comprehension easier to read.
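One way to put a number on the cost of factoring the condition out is to time both comprehensions on the same input and keep the best of several runs. The sketch below is only illustrative: the names with_call, inline and best_time, the repeat counts, and the input size are assumptions, and absolute timings will vary by machine and Python version.

```python
import timeit

def my_filter(n):
    return n < 5

def with_call():
    # condition factored out into a function
    return [i for i in range(10) if my_filter(i)]

def inline():
    # condition written inline
    return [i for i in range(10) if i < 5]

def best_time(func, number=20000, repeat=5):
    # the minimum of several runs is less sensitive to background noise than a single run
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

t_call = best_time(with_call)
t_inline = best_time(inline)
print("with my_filter: %.3e s per call" % t_call)
print("inline:         %.3e s per call" % t_inline)
print("ratio:          %.2f" % (t_call / t_inline))
```

If the ratio is acceptable for the code path in question, the more readable version with the named predicate may be worth the extra call.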

Matthew Moisen asked Aug 25 '19 16:08





2 Answers

Note that the bytecode for both a and b only creates and calls a <listcomp> code object that is defined elsewhere.

2           0 LOAD_CONST               1 (<code object <listcomp> at 0x00000211CBE8B300, file "<stdin>", line 2>)

Since the wrapper functions a and b are identical, their bytecode is the same. Only the addresses of the listcomp code objects differ.

In Python 3.7, dis.dis also disassembles the nested listcomp code objects. Here's the complete code and the output:

import sys
import dis

def my_filter(n):
    return n < 5

def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

print(sys.version)
print('-' * 70)
dis.dis(a)
print('-' * 70)
dis.dis(b)

--

3.7.3 (default, May 19 2019, 21:16:26) 
[Clang 10.0.1 (clang-1001.0.46.4)]
----------------------------------------------------------------------
  9           0 LOAD_CONST               1 (<code object <listcomp> at 0x1065c61e0, file "/w/test/x.py", line 9>)
              2 LOAD_CONST               2 ('a.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x1065c61e0, file "/w/test/x.py", line 9>:
  9           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_GLOBAL              0 (my_filter)
             10 LOAD_FAST                1 (i)
             12 CALL_FUNCTION            1
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
----------------------------------------------------------------------
 13           0 LOAD_CONST               1 (<code object <listcomp> at 0x1066188a0, file "/w/test/x.py", line 13>)
              2 LOAD_CONST               2 ('b.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x1066188a0, file "/w/test/x.py", line 13>:
 13           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LOAD_CONST               0 (5)
             12 COMPARE_OP               0 (<)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE

For Python versions before 3.7, see "Python: analyze a list comprehension with dis".
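On interpreters where dis.dis does not descend into nested code objects, a similar listing can be produced by walking co_consts by hand. The following is a minimal sketch: iter_code_objects and dump_all are hypothetical helper names, not part of the dis module. Note also that newer interpreters which compile list comprehensions inline may have no separate <listcomp> object to find, in which case only the outer function is listed.

```python
import dis
import io
import types

def my_filter(n):
    return n < 5

def a():
    return [i for i in range(10) if my_filter(i)]

def iter_code_objects(code):
    # yield the code object itself, then every code object nested in its constants
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)

def dump_all(func, file=None):
    # dis.disassemble handles exactly one code object, so nothing is listed twice
    for code in iter_code_objects(func.__code__):
        print("Disassembly of %s:" % code.co_name, file=file)
        dis.disassemble(code, file=file)
        print(file=file)

# capture the listing in a string so it can be inspected as well as printed
buffer = io.StringIO()
dump_all(a, file=buffer)
listing = buffer.getvalue()
print(listing)
```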

georg answered Nov 03 '22 02:11



List comprehensions are compiled to inner functions because they create a separate namespace. The inner functions for the list comprehensions in a and b differ:

>>> dis.dis(a.__code__.co_consts[1])
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_GLOBAL              0 (my_filter)
             10 LOAD_FAST                1 (i)
             12 CALL_FUNCTION            1
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
>>> dis.dis(b.__code__.co_consts[1])
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LOAD_CONST               0 (5)
             12 COMPARE_OP               0 (<)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE

There you can see the function call in a and the comparison in b.
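The same difference can be checked programmatically instead of by eye. As a rough sketch, the hypothetical helper instruction_set below collects (opname, argrepr) pairs from a function and from any code objects nested in it, using dis.get_instructions, so the two versions can be compared with set operations. Opcode names differ between Python versions, so the exact pairs printed are not guaranteed.

```python
import dis
import types

def my_filter(n):
    return n < 5

def a():
    return [i for i in range(10) if my_filter(i)]

def b():
    return [i for i in range(10) if i < 5]

def instruction_set(code):
    # collect (opname, argrepr) pairs, skipping constants that are code objects
    # because their repr contains a memory address that always differs
    pairs = set()
    for ins in dis.get_instructions(code):
        if not isinstance(ins.argval, types.CodeType):
            pairs.add((ins.opname, ins.argrepr))
    # recurse into nested code objects such as <listcomp>, if there are any
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            pairs |= instruction_set(const)
    return pairs

ops_a = instruction_set(a.__code__)
ops_b = instruction_set(b.__code__)

print("only in a:", sorted(ops_a - ops_b))
print("only in b:", sorted(ops_b - ops_a))
```

The instructions that appear only in a should include the load of my_filter, while those only in b should include the comparison against 5.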

Daniel answered Nov 03 '22 04:11
