I came across a confusing problem when unit testing a module. The module casts values, and I want to compare these values.
There is a difference between comparing with == and comparing with is (I'm partly aware of the difference):
>>> 0.0 is 0.0
True # as expected
>>> float(0.0) is 0.0
True # as expected
As expected till now, but here is my "problem":
>>> float(0) is 0.0
False
>>> float(0) is float(0)
False
Why? At least the last one is really confusing to me. The internal representation of float(0) and float(0.0) should be equal. Comparison with == is working as expected.
This has to do with how is works. It checks identity (references) rather than value: it returns True only if both operands refer to the same object.
In this case, they are different instances; float(0) and float(0) have the same value (they compare equal with ==), but are distinct objects as far as Python is concerned. Separately, the CPython implementation caches small integers as singleton objects in the range [x | x ∈ ℤ ∧ -5 ≤ x ≤ 256]. The float case from the question:
>>> 0.0 is 0.0
True
>>> float(0) is float(0) # Not the same reference, unique instances.
False
In this example we can demonstrate the integer caching principle:
>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False
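Literals typed on one line can end up sharing a constant, which muddies this kind of check. A less ambiguous way to observe the small-integer cache is to build the integers at run time. The following is a minimal sketch, assuming CPython; the helper name same_object_when_parsed is hypothetical and not part of any library:

```python
def same_object_when_parsed(text):
    """Parse the same text twice at run time and report whether both
    results are the very same object (hypothetical helper)."""
    first = int(text)
    second = int(text)
    return first is second


# CPython-specific behaviour, not a language guarantee.
print(same_object_when_parsed("256"))   # inside the cached range
print(same_object_when_parsed("257"))   # outside the cached range
print(int("257") == int("257"))         # value equality holds regardless
```

On CPython this should print True, False, True: the identity result changes at the edge of the cache, while == is unaffected.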
Now, if a float is passed to float(), that same float object is simply returned (short-circuited); there is no need to instantiate a new float from an existing one:
>>> 0.0 is 0.0
True
>>> float(0.0) is float(0.0)
True
This can be demonstrated further by using int() as well:
>>> int(256.0) is int(256.0) # Same reference, cached.
True
>>> int(257.0) is int(257.0) # Different references are returned, not cached.
False
>>> 257 is 257 # Same reference.
True
>>> 257.0 is 257.0 # Same reference. As @Martijn Pieters pointed out.
True
However, the results of is also depend on the scope in which it is executed (beyond the span of this question/explanation); please refer to user @Jim's fantastic explanation on code objects. Even Python's documentation includes a section on this behavior:
[7] Due to automatic garbage-collection, free lists, and the dynamic nature of descriptors, you may notice seemingly unusual behaviour in certain uses of the is operator, like those involving comparisons between instance methods, or constants. Check their documentation for more info.
If a float object is supplied to float(), CPython* just returns it without making a new object.
This can be seen in PyNumber_Float (which is eventually called from float_new), where the object o passed in is checked with PyFloat_CheckExact; if that check is true, it just increases its reference count and returns it:
if (PyFloat_CheckExact(o)) {
Py_INCREF(o);
return o;
}
As a result, the id of the object stays the same. So the expression
>>> float(0.0) is float(0.0)
reduces to:
>>> 0.0 is 0.0
But why does that equal True? Well, CPython has some small optimizations.
In this case, it uses the same object for the two occurrences of 0.0 in your command because they are part of the same code object (short disclaimer: they're on the same logical line); so the is test will succeed.
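This can be further corroborated if you execute float(0.0) as separate statements at the interactive prompt, one per input line, and then check for identity (statements joined with ; on a single line are compiled together, so they can still share the constant):
a = float(0.0)  # entered on its own line, so Python compiles it on its own
b = float(0.0)  # a second, separately compiled statement
a is b # False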
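The code-object explanation can also be checked away from the interactive prompt by calling compile() directly. This is only a sketch, assuming CPython's habit of merging equal constants within one code object; the filename "<sketch>" and the namespace names are arbitrary choices for illustration:

```python
# Two float literals compiled together, as one code object.
together = compile("a = 0.0; b = 0.0", "<sketch>", "exec")
float_consts = [c for c in together.co_consts if isinstance(c, float)]
print(len(float_consts))  # how many distinct 0.0 constants were stored

ns = {}
exec(together, ns)
print(ns["a"] is ns["b"])  # both names were loaded from the same constant

# The same two assignments compiled as separate code objects.
ns2 = {}
exec(compile("a = 0.0", "<sketch>", "exec"), ns2)
exec(compile("b = 0.0", "<sketch>", "exec"), ns2)
print(ns2["a"] is ns2["b"])  # typically False on CPython; not guaranteed
```

The first two prints show the shared constant; the last one depends on the interpreter build and should not be relied on either way.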
On the other hand, if an int (or a str) is supplied, CPython will create a new float object from it and return that. For this, it uses PyFloat_FromDouble and PyFloat_FromString respectively.
The effect is that the returned objects have different ids (which is what is compares):
# Python passes the same object representing 0 to both calls to float,
# but float returns a new float object each time it is given an int.
# Therefore, the result will be False
float(0) is float(0)
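The two paths described here (an exact float handed back unchanged, anything else converted into a new object) can be compared side by side. This is a minimal sketch, assuming CPython; MyFloat is a hypothetical subclass introduced only to show that the PyFloat_CheckExact shortcut does not apply to subclasses:

```python
class MyFloat(float):
    """Hypothetical float subclass used only for this illustration."""


x = 1.5
m = MyFloat(1.5)

print(float(x) is x)            # exact float: the same object comes back
print(float(m) is m)            # subclass instance: a new plain float is built
print(type(float(m)) is float)  # and that new object is an exact float
print(float(0) is float(0))     # ints: a new float object per call
print(float("1.5") == x)        # the values still compare equal
```

On CPython this should print True, False, True, False, True. Only the last line is something the language itself promises.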
*Note: All previously mentioned behavior applies to the C implementation of Python, i.e. CPython. Other implementations might exhibit different behavior. In short, don't depend on it.
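For the unit-testing situation in the question, one way to stay independent of all this is to assert on values rather than identities. The following is a rough sketch using the standard unittest module; cast_to_float is a hypothetical stand-in for the module under test, not the asker's actual code:

```python
import unittest


def cast_to_float(value):
    """Hypothetical stand-in for the casting code under test."""
    return float(value)


class CastTests(unittest.TestCase):
    def test_value_equality(self):
        # assertEqual compares with ==, so it does not matter which
        # object float() happened to return.
        self.assertEqual(cast_to_float(0), 0.0)
        self.assertEqual(cast_to_float("0"), cast_to_float(0.0))

    def test_result_type(self):
        # If the type matters as well, check it explicitly.
        self.assertIsInstance(cast_to_float(0), float)


# Run the tests explicitly so the sketch also works when pasted into a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CastTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

assertIs (the identity check) is best kept for genuine singletons such as None.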