I am surprised by the following behavior:
>>> import numpy as np
>>> from collections import Counter
>>> my_list = [1,2,2, np.nan, np.nan]
>>> Counter(my_list)
Counter({nan: 2, 2: 2, 1: 1}) # Counter treats np.nan as equal and
# tells me that I have two of them
>>> np.nan == np.nan # However, np.nan's are not equal
False
What is going on here?
When I use float('nan') instead of np.nan, I get the behavior I expect:
>>> my_list = [1,2,2, float('nan'), float('nan')]
>>> Counter(my_list)
Counter({2: 2, nan: 1, 1: 1, nan: 1}) # two different nan's
>>> float('nan') == float('nan')
False
I am using Python 2.7.3 and NumPy 1.8.1.
Edit:
If I do:
>>> a = 300
>>> b = 300
>>> a is b
False
>>> Counter([a, b])
Counter({300: 2})
So, Counter or any Python dict considers two objects X and Y not the same if:
X == Y -> False
and
X is Y -> False
correct?
This isn't about numpy.nan vs. float("nan"); it's that you've got two separate float nans.
>>> np.nan is np.nan
True
>>> float("nan") is float("nan")
False
and so
>>> Counter([1,2,2, np.nan, np.nan])
Counter({nan: 2, 2: 2, 1: 1})
>>> Counter([1,2,2, float("nan"), float("nan")])
Counter({2: 2, nan: 1, 1: 1, nan: 1})
but
>>> f = float("nan")
>>> Counter([1,2,2, f, f])
Counter({nan: 2, 2: 2, 1: 1})