I ran into unexpected behavior with Python's NumPy, set, and NaN (not-a-number):
>>> set([np.float64('nan'), np.float64('nan')])
set([nan, nan])
>>> set([np.float32('nan'), np.float32('nan')])
set([nan, nan])
>>> set([np.float('nan'), np.float('nan')])
set([nan, nan])
>>> set([np.nan, np.nan])
set([nan])
>>> set([float('nan'), float('nan')])
set([nan, nan])
Here np.nan yields a single-element set, while NaNs created through NumPy's float types yield a set with multiple NaNs. So does float('nan')! And note that:
>>> type(float('nan')) == type(np.nan)
True
I wonder how this difference comes about and what the rationale is behind the different behaviors.
One of the properties of NaN is that NaN != NaN, unlike all other numbers. However, when the implementation of set finds an existing member with the same hash, it first checks whether the new object is that very same object (same id) and only then falls back to ==. If you have two objects with different ids that both have the value NaN, you'll get two entries in the set, because neither check succeeds. If they both have the same id, they collapse into a single entry.
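The identity-before-equality check is not specific to NaN. As a minimal sketch, the hypothetical class NeverEqual below (not part of NumPy or the standard library) mimics NaN by never comparing equal to anything, and uses a constant hash so that every instance is compared against the others:

```python
class NeverEqual:
    """Hypothetical stand-in for NaN: never equal to anything, itself included."""

    def __eq__(self, other):
        return False

    def __hash__(self):
        # Constant hash so every instance collides and must be compared
        return 0


a = NeverEqual()
b = NeverEqual()

print(a == a)        # False: equality always fails, as with NaN
print(len({a, a}))   # 1: same object, the identity check succeeds
print(len({a, b}))   # 2: different objects, and == fails too
print(a in [a])      # True: list membership also tests identity first
```

The same object is never stored twice even though it does not equal itself, which matches what happens with a single NaN object.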
As pointed out by others, np.nan is a single object that will always have the same id, whereas each call such as np.float64('nan') or float('nan') creates a new object.
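The same effect can be reproduced without NumPy. In this sketch math.nan is used as a stand-in for np.nan, on the assumption that both behave alike here: each is a single Python float stored as a module attribute, so every reference to it is the same object.

```python
import math

# math.nan is one module-level float object, analogous to np.nan
print(math.nan is math.nan)          # True

shared = {math.nan, math.nan}        # the same object twice
fresh = {float("nan"), float("nan")} # two separately created objects

print(len(shared))  # 1
print(len(fresh))   # 2
```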