In my opinion, things like float('nan')
should be optimized, but apparently they aren't in Python.
>>> NaN = float('nan')
>>> a = [ 1, 2, 3, NaN ]
>>> NaN in a
True
>>> float('nan') in a
False
Is there a reason nan isn't optimized like other values? As I see it, nan is just nan.
Also, sorted gives weird results on a list that contains nan:
>>> sorted([3, nan, 4, 2, nan, 1])
[3, nan, 1, 2, 4, nan]
>>> 3 > float('nan')
False
>>> 3 < float('nan')
False
Comparison with nan is defined like this, but it doesn't seem 'pythonic' to me. Why doesn't it raise an error?
In Python, the float type has a nan value. nan stands for "not a number" and is defined by the IEEE 754 floating-point standard.
Raising an error is how Python signals that something went wrong, for example when an operation receives a value it cannot handle. A simple example:
>>> int('a')
ValueError: invalid literal for int() with base 10: 'a'
Membership testing
Two different instances of float('nan') are not equal to each other. In fact, a nan never compares equal to anything, not even to itself. Both values are "not a number", so there is no reason for them to be equal; they are simply distinct objects that do not represent numbers:
print(float('nan') == float('nan')) # False
As the Python documentation on membership test operations states:
For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).
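Note the identity check (x is e), which comes before the equality check. That is why, in your question, NaN in a returns True (the list holds that very object) while float('nan') in a returns False (the new object is neither identical nor equal to any element).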
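The rule quoted above can be spelled out directly. This is a minimal sketch: the helper name contains is made up for illustration and is not part of Python, and math.isnan is shown only as one way to test for nan by value instead of by identity.

```python
import math

nan = float('nan')
a = [1, 2, 3, nan]

def contains(container, x):
    # Hypothetical helper mirroring the documented rule for `x in y`.
    return any(x is e or x == e for e in container)

same_object = contains(a, nan)           # the identity check succeeds
new_object = contains(a, float('nan'))   # neither identical nor equal
# Test by value instead: is any element a float nan?
has_nan = any(isinstance(e, float) and math.isnan(e) for e in a)

print(same_object, new_object, has_nan)
```

The first two results should agree with nan in a and float('nan') in a respectively, since the helper only restates the documented behaviour.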
Sorting in Python
Python uses the Timsort algorithm for its sorted() function. I'm not going to go into its details; I just want to demonstrate a simple example.
Here are my classes A and B. They stand in for float('nan'): like float('nan'), they return False for the < and > comparisons, and they print a message whenever one of those methods is called:
class A:
    def __init__(self, n):
        self.n = n
    def __lt__(self, other):
        print(self, 'lt is calling', other)
        return False
    def __gt__(self, other):
        print(self, 'gt is calling', other)
        return False
    def __repr__(self):
        return f'A({self.n})'
class B:
    def __init__(self, n):
        self.n = n
    def __lt__(self, other):
        print(self, 'lt is calling', other)
        return False
    def __gt__(self, other):
        print(self, 'gt is calling', other)
        return False
    def __repr__(self):
        return f'B({self.n})'
When we use the sorted() function (or the .sort() method of a list) without the reverse=True argument, we're asking for the iterable to be sorted in ascending order. To do this, Python calls the __lt__ method repeatedly, starting with the second object in the list to see whether it is less than the one before it, and so on:
lst = [A(1), B(2), A(3), B(4)]
print(sorted(lst))
Output:
B(2) lt is calling A(1)
A(3) lt is calling B(2)
B(4) lt is calling A(3)
[A(1), B(2), A(3), B(4)]
Now, switching back to your example, with A(1) standing in for nan:
lst = [3, A(1), 4, 2, A(1), 1]
print(sorted(lst))
Output:
A(1) lt is calling 3
A(1) gt is calling 4
A(1) gt is calling 2
A(1) lt is calling 2
A(1) lt is calling 4
A(1) gt is calling 1
[3, A(1), 1, 2, 4, A(1)]
A(1).__lt__(3) returns False. This means A(1) is not less than 3, so 3 is in the correct position relative to A(1).
int.__lt__(4, A(1)) gets called next, and because it returns the NotImplemented object, Python checks whether A(1) implements __gt__. It does, so A(1).__gt__(4) is called and returns False again, which means the A(1) object is in the correct place relative to 4.
This is why the result of sorted() seems weird but is predictable. In both cases, when the int class returns NotImplemented and when __lt__ is called on A(1) directly, the A(1) object returns False.
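The fallback from int.__lt__ to the reflected __gt__ can be checked in isolation. This is a small sketch: the class name AlwaysFalse and the calls list are made up for illustration, and the class records its calls instead of printing them.

```python
calls = []  # records which comparison methods actually run

class AlwaysFalse:
    # Hypothetical stand-in for nan: every < and > comparison is False.
    def __lt__(self, other):
        calls.append(('__lt__', other))
        return False
    def __gt__(self, other):
        calls.append(('__gt__', other))
        return False

obj = AlwaysFalse()

# int does not know how to compare itself with AlwaysFalse.
direct = int.__lt__(4, obj)

# The < operator therefore falls back to the reflected obj.__gt__(4).
via_operator = 4 < obj

print(direct, via_operator, calls)
```

Only the single __gt__ call ends up in calls, because calling int.__lt__ directly never reaches the object's own methods.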
To follow the remaining steps, look at how Timsort works with these points in mind. I would include those steps here if I had studied the Timsort algorithm carefully.
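If the goal is simply a predictable order, one common workaround is a sort key that places every nan after the ordinary numbers. This is not something the question asked for or that Timsort requires, just a minimal sketch that assumes every element is an int or a float:

```python
import math

nan = float('nan')
data = [3, nan, 4, 2, nan, 1]

# The key is (is_nan, value). Tuples starting with False sort before
# tuples starting with True, so a nan is never compared with a number.
result = sorted(data, key=lambda x: (math.isnan(x), x))

print(result)
```

The ordinary numbers come out in ascending order, followed by the nan values.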