Yesterday I came across this odd unpacking difference between Python 2 and Python 3, and could not find an explanation after a quick Google search.
Python 2.7.8
a = 257
b = 257
a is b # False
a, b = 257, 257
a is b # False
Python 3.4.2
a = 257
b = 257
a is b # False
a, b = 257, 257
a is b # True
I know it probably does not affect the correctness of a program, but it does bug me a little. Could anyone give some insight into this difference in unpacking?
This behaviour is at least in part to do with how the interpreter does constant folding and how the REPL executes code.
First, remember that CPython compiles code (to an AST and then to bytecode) before it evaluates the bytecode. During compilation, the compiler looks for constants that are immutable and caches them. It also deduplicates them. So if it sees
a = 257
b = 257
it will store a and b against the same object:
import dis
def f():
    a = 257
    b = 257
dis.dis(f)
#>>> 4 0 LOAD_CONST 1 (257)
#>>> 3 STORE_FAST 0 (a)
#>>>
#>>> 5 6 LOAD_CONST 1 (257)
#>>> 9 STORE_FAST 1 (b)
#>>> 12 LOAD_CONST 0 (None)
#>>> 15 RETURN_VALUE
Note the LOAD_CONST 1. The 1 is the index into co_consts:
f.__code__.co_consts
#>>> (None, 257)
So these both load the same 257. Why doesn't this occur with:
$ python2
Python 2.7.8 (default, Sep 24 2014, 18:26:21)
>>> a = 257
>>> b = 257
>>> a is b
False
$ python3
Python 3.4.2 (default, Oct 8 2014, 13:44:52)
>>> a = 257
>>> b = 257
>>> a is b
False
?
Each line in this case is a separate compilation unit and the deduplication cannot happen across them. It works similarly to
compile a = 257
run a = 257
compile b = 257
run b = 257
compile a is b
run a is b
As such, each of these code objects has its own constant cache, so the two 257 literals are not deduplicated against each other.
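The compile/run steps above can be mimicked outside the REPL. This is only a minimal sketch, assuming CPython: it uses the built-in compile() in "single" mode as an approximation of what the interactive prompt does, not its actual implementation. The names namespace and units are just illustrative.

```python
# Sketch: compile each "line" on its own, roughly as the REPL does.
namespace = {}
units = []
for source in ("a = 257\n", "b = 257\n"):
    code = compile(source, "<stdin>", "single")  # one compilation unit per line
    units.append(code)
    exec(code, namespace)  # run it before the next line is compiled

for code in units:
    # Each code object carries its own co_consts tuple.
    print(code.co_consts)

# The result here depends on the interpreter and build; the language
# itself does not promise either answer.
print(namespace["a"] is namespace["b"])
```

The point of the sketch is that the two 257 constants live in two different co_consts tuples, so nothing at compile time has a chance to merge them.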
This implies that if we remove the line break, the is check will return True:
>>> a = 257; b = 257
>>> a is b
True
Indeed this is the case for both Python versions. In fact, this is exactly why
>>> a, b = 257, 257
>>> a is b
True
returns True as well; it's not because of any property of unpacking; the two constants just get placed in the same compilation unit.
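The "same compilation unit" idea can also be checked without typing into the REPL. Here is a small sketch, assuming a reasonably recent CPython; same_object is a hypothetical helper written for this answer, not anything from the standard library.

```python
def same_object(source):
    """Compile source as ONE unit, run it, and report whether a is b."""
    namespace = {}
    exec(compile(source, "<sketch>", "exec"), namespace)
    return namespace["a"] is namespace["b"]

# Two statements on one line: a single compilation unit.
print(same_object("a = 257; b = 257"))

# A line break inside one source string is still a single unit,
# unlike two separate lines typed at the prompt.
print(same_object("a = 257\nb = 257"))

# Tuple unpacking: also a single unit. The result can differ on old
# interpreters that do not share the items of the folded tuple.
print(same_object("a, b = 257, 257"))
```

What matters is how much source is handed to the compiler at once, not whether the assignment uses unpacking.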
This returns False for versions which don't fold properly; filmor links to Ideone, which shows this failing on 2.7.3 and 3.2.3. On these versions, the tuples created do not share their items with the other constants:
import dis
def f():
    a, b = 257, 257
    print(a is b)
print(f.__code__.co_consts)
#>>> (None, 257, (257, 257))
n = f.__code__.co_consts[1]
n1 = f.__code__.co_consts[2][0]
n2 = f.__code__.co_consts[2][1]
print(id(n), id(n1), id(n2))
#>>> (148384292, 148384304, 148384496)
Again, though, this is not about a change in how the objects are unpacked; it is only a change in how the objects are stored in co_consts.
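To see how the interpreter you are running stores these constants, something along these lines can help. It is a sketch that assumes CPython; the function name unpack is illustrative, and the exact contents of co_consts vary between versions, so no particular output is implied.

```python
def unpack():
    a, b = 257, 257
    return a is b

consts = unpack.__code__.co_consts
print(consts)

# Collect every object equal to 257 that is reachable from co_consts,
# either directly or as an item of a constant tuple.
found = []
for const in consts:
    items = const if isinstance(const, tuple) else (const,)
    found.extend(item for item in items if type(item) is int and item == 257)

# One distinct id means the constants are shared; more than one means
# the interpreter kept separate copies, as in the older versions above.
print(len({id(obj) for obj in found}))
print(unpack())
```

If the count of distinct ids is greater than one, you are looking at the behaviour described for 2.7.3 and 3.2.3.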