Could someone explain this strange result on Python 2.6.6 to me?
>>> a = "xx"
>>> b = "xx"
>>> a.__hash__() == b.__hash__()
True
>>> a is b
True # ok.. was just to be sure
>>> a = "x" * 2
>>> b = "x" * 2
>>> a.__hash__() == b.__hash__()
True
>>> a is b
True # yeah.. looks ok so far !
>>> n = 2
>>> a = "x" * n
>>> b = "x" * n
>>> a.__hash__() == b.__hash__()
True # still okay..
>>> a is b
False # hey! What the F... ?
The is operator tells you whether two variables point to the same object in memory. It is rarely useful and often confused with the == operator, which tells you whether two objects "look the same".
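To make the difference concrete, here is a minimal sketch using lists rather than strings, so that interning cannot muddy the picture. The variable names are made up for illustration.

```python
# Two separately built lists with the same contents.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first  # a second name bound to the very same object

equal_by_value = first == second   # compares the contents
same_object = first is second      # compares identity
alias_is_same = alias is first     # both names refer to one object

print(equal_by_value, same_object, alias_is_same)
```

The two lists compare equal but are distinct objects, whereas alias and first are one and the same object.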
It is particularly confusing when used with things like short string literals, because the Python compiler interns these for efficiency. In other words, when you write "xx", the compiler (emits bytecode that) creates one string object in memory and causes all literals "xx" to point to it. This explains why your first two comparisons are True (in the second one, CPython folds the constant expression "x" * 2 into the literal "xx" at compile time). Notice that you can get the id of the strings by calling id on them, which (at least on CPython) is their address in memory:
>>> a = "xx"
>>> b = "xx"
>>> id(a)
38646080
>>> id(b)
38646080
>>> a is b
True
>>> a = "x"*10000
>>> b = "x"*10000
>>> id(a)
38938560
>>> id(b)
38993504
>>> a is b
False
The third comparison is False because the strings a and b have not been interned. The expression "x" * n is evaluated at run time, so each evaluation builds a new string object (the compiler is presumably not smart enough to notice that the variable n is defined once and then never modified).
You can in fact force Python to intern strings by, well, asking it to, using the built-in intern function. This will give you a piddling performance increase at best. It's probably useless.
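For what it's worth, here is a rough sketch of what asking for interning looks like. On Python 2 the function is the built-in intern; on Python 3 it lives at sys.intern, so the snippet picks whichever exists. The name intern_string is just a local alias invented for this example.

```python
import sys

try:
    intern_string = intern      # Python 2: built-in function
except NameError:
    intern_string = sys.intern  # Python 3: moved into the sys module

n = 2
a = "x" * n  # built at run time, so not interned automatically
b = "x" * n

a_interned = intern_string(a)
b_interned = intern_string(b)

# Both calls return the single interned copy of "xx".
same_after_intern = a_interned is b_interned
print(same_after_intern)
```

After interning, both names refer to the same object, so the is comparison succeeds. Comparing with == would have worked all along without any of this.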
Moral: don't use is with string literals. Or int literals. Or anywhere you don't mean it, really.
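As an illustration of the moral, here is a small sketch that uses == where the value matters and is only where identity is really meant (checking for None, which is a single object). The function name describe and its expected parameter are hypothetical, invented for this example.

```python
def describe(value, expected="xx"):
    """Classify value against expected (illustrative helper, not from any library)."""
    if value is None:      # identity check: there is only one None object
        return "missing"
    if value == expected:  # equality check: compares the characters
        return "match"
    return "different"


n = 2
results = [describe(None), describe("x" * n), describe("y" * n)]
print(results)
```

The string built at run time still counts as a match, because == does not care whether the two strings happen to share an address.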