In Python (2.7.2), why does
import dis
dis.dis("i in (2, 3)")
work as expected, whereas
import dis
dis.dis("i in [2, 3]")
raises:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dis.py", line 45, in dis
    disassemble_string(x)
  File "/usr/lib/python2.7/dis.py", line 112, in disassemble_string
    labels = findlabels(code)
  File "/usr/lib/python2.7/dis.py", line 166, in findlabels
    oparg = ord(code[i]) + ord(code[i+1])*256
IndexError: string index out of range
Note that this doesn't affect Python 3.
In Python 2.x, the str type holds raw bytes, so dis assumes that any string you pass it is compiled bytecode. It tries to disassemble that string as bytecode and, purely due to the implementation details of Python bytecode, succeeds for "i in (2,3)". The output, though, is gibberish.
In Python 3.x, the str type is for text and the bytes type is for raw bytes, so dis can distinguish between compiled bytecode and strings, and it assumes it is getting source code if it gets a string.
Here's the thought process I followed to work this one out.
I tried it on my Python (3.2):
>>> import dis
>>> dis.dis("i in (2,3)")
1 0 LOAD_NAME 0 (i)
3 LOAD_CONST 2 ((2, 3))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
>>> dis.dis("i in [2,3]")
1 0 LOAD_NAME 0 (i)
3 LOAD_CONST 2 ((2, 3))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
Obviously, this works.
I tried it on Python 2.7:
>>> import dis
>>> dis.dis("i in (2,3)")
0 BUILD_MAP 26912
3 JUMP_FORWARD 10272 (to 10278)
6 DELETE_SLICE+0
7 <44>
8 DELETE_SLICE+1
9 STORE_SLICE+1
>>> dis.dis("i in [2,3]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\dis.py", line 45, in dis
    disassemble_string(x)
  File "C:\Python27\lib\dis.py", line 112, in disassemble_string
    labels = findlabels(code)
  File "C:\Python27\lib\dis.py", line 166, in findlabels
    oparg = ord(code[i]) + ord(code[i+1])*256
IndexError: string index out of range
Aha! Notice also that the generated bytecode in Python 3.2 is what you would expect ("load i, load (2,3), test for membership, return the result"), whereas what you get in Python 2.7 is gibberish. Clearly, dis is disassembling the string as bytecode in 2.7 but compiling it as Python source in 3.2.
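To see why one string happens to survive and the other raises, the walk that Python 2.7's dis performs over the bytes can be imitated by hand. The following is a minimal sketch, not the actual dis source: walk_as_bytecode is a hypothetical helper, and it relies on the CPython 2.7 rule that opcodes numbered 90 or higher (HAVE_ARGUMENT) are followed by a two-byte argument.

```python
HAVE_ARGUMENT = 90  # CPython 2.7: opcodes >= 90 carry a 2-byte argument


def walk_as_bytecode(source):
    """Walk the characters of `source` as if they were Python 2.7 bytecode.

    Hypothetical helper for illustration. Returns (steps, ok): each step is
    (character, argument or None); ok is False when an opcode needs argument
    bytes that lie past the end of the string.
    """
    data = bytearray(source.encode("ascii"))  # yields ints on Python 2 and 3
    steps = []
    i = 0
    while i < len(data):
        op = data[i]
        i += 1
        if op < HAVE_ARGUMENT:
            # Opcode without an argument: consume one byte only.
            steps.append((chr(op), None))
            continue
        if i + 1 >= len(data):
            # The real dis would index past the end here (IndexError).
            steps.append((chr(op), None))
            return steps, False
        # Same little-endian arithmetic as the findlabels line in the traceback.
        steps.append((chr(op), data[i] + data[i + 1] * 256))
        i += 2
    return steps, True


for text in ("i in (2,3)", "i in [2,3]"):
    steps, ok = walk_as_bytecode(text)
    print(text, "->", "walk completes" if ok else "runs off the end", steps)
```

With "i in (2,3)" the final byte is ")" (41), which is below 90 and needs no argument, so the walk finishes. With "i in [2,3]" the final byte is "]" (93), which demands two more bytes that are not there, hence the IndexError.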
I had a look in the source code for dis.dis. Here are the key points:
Python 2.7:
elif isinstance(x, str):
    disassemble_string(x)
Python 3.2:
elif isinstance(x, (bytes, bytearray)): # Raw bytecode
    _disassemble_bytes(x)
elif isinstance(x, str): # Source code
    _disassemble_str(x)
Just for fun, let's check this by passing the same bytes to dis in Python 3:
>>> dis.dis("i in (2,3)".encode())
0 BUILD_MAP 26912
3 JUMP_FORWARD 10272 (to 10278)
6 <50>
7 <44>
8 <51>
9 <41>
Aha! Gibberish! (Though note that it's slightly different gibberish -- the bytecode has changed with the Python version!)
In Python 2.x, dis.dis expects bytecode as an argument, not Python source code. Although your first example "works", it doesn't provide any meaningful output. You probably want:
import compiler, dis
# The compiler module exists only in Python 2; it was removed in Python 3.
code = compiler.compile("i in [2, 3]", '', 'single')
dis.dis(code)
This works as expected. (I tested in 2.7 only).
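The same idea can be sketched with the built-in compile(), which, unlike the compiler module, is available in both Python 2 and Python 3. The "<string>" filename and the "eval" mode are choices made for this illustration, not something taken from the answer above; the exact disassembly printed will vary with the interpreter version.

```python
import dis

# Compile the source text explicitly, so dis receives a code object
# rather than a string it has to interpret one way or the other.
code = compile("i in [2, 3]", "<string>", "eval")
dis.dis(code)
```

Because dis is handed a code object here, the question of whether a string means source code or raw bytecode never comes up.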
If you are just trying to get the bytecode for a simple expression, the simplest approach is to pass dis a lambda with your expression as its body:
>>> import dis
>>> dis.dis(lambda i : i in [3,2])
1 0 LOAD_FAST 0 (i)
3 LOAD_CONST 2 ((3, 2))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE