
Python\Numpy: Comparing arrays with NAN [duplicate]

Tags: python, numpy

Why are the following two lists not equal?

a = [1.0, np.NAN] 
b = np.append(np.array(1.0), [np.NAN]).tolist()

I am using the following to check whether they are equal.

((a == b) | (np.isnan(a) & np.isnan(b))).all(), np.in1d(a,b)

Using np.in1d(a, b) it seems the np.NAN values are not equal but I am not sure why this is. Can anyone shed some light on this issue?

Black asked May 22 '14 14:05




2 Answers

NaN values never compare equal. That is, the test NaN==NaN is always False by definition of NaN.

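So two lists that hold separately created NaN objects, such as [1.0, float('nan')] and [1.0, float('nan')], also compare unequal. The exception is when both sides hold the very same NaN object (or are the same list): Python's list comparison treats identical objects as equal before it tries ==.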

If you want to test whether a value is NaN, use the numpy.isnan() function (or math.isnan() for plain Python floats). For plain lists, I don't see any obvious way of obtaining the comparison semantics that you seem to want other than by manually iterating over the list with a loop.

Consider the following:

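import math

def nan_eq(a, b):
    """Compare two sequences of floats, treating NaN as equal to NaN."""
    # zip() stops at the shorter input, so compare lengths first.
    if len(a) != len(b):
        return False
    for i, j in zip(a, b):
        # A pair matches if the values are equal or if both are NaN.
        if i != j and not (math.isnan(i) and math.isnan(j)):
            return False
    return True

a = [1.0, float('nan')]
b = [1.0, float('nan')]

print(float('nan') == float('nan'))
print(a == a)
print(a == b)
print(nan_eq(a, a))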

It will print:

False
True
False
True

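The test a==a succeeds because Python's list comparison checks each pair of elements for identity before it checks for equality. The NaN in a is the same object on both sides, so it counts as equal even though NaN==NaN is False. In a==b the two NaN values are distinct objects, so the comparison falls through to == and fails.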
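The same-object behavior described above can be sketched as follows. This is a minimal illustration, assuming CPython's built-in list comparison; the variable names are made up for the example and do not come from the question.

```python
nan = float('nan')

# The same NaN object appears on both sides: the identity check succeeds,
# so the lists compare equal even though nan == nan is False.
same_object = [1.0, nan] == [1.0, nan]

# Two separately created NaN objects: identity fails, then == fails,
# so the lists compare unequal.
distinct_objects = [1.0, float('nan')] == [1.0, float('nan')]

print(nan == nan)        # False
print(same_object)       # True
print(distinct_objects)  # False
```

This is presumably also why a == b is False in the question: tolist() builds a new Python float for the NaN instead of reusing the np.NAN object itself.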

Emmet answered Oct 31 '22 03:10



Since a and b are lists, a == b is an ordinary list comparison that returns a single bool rather than an elementwise array, and so your numpy-like logic won't work:

>>> a == b
False

The command you've quoted only works if they're arrays:

>>> a,b = np.asarray(a), np.asarray(b)
>>> a == b
array([ True, False], dtype=bool)
>>> (a == b) | (np.isnan(a) & np.isnan(b))
array([ True,  True], dtype=bool)
>>> ((a == b) | (np.isnan(a) & np.isnan(b))).all()
True

This should work to compare two arrays of the same shape: each pair of elements must either be equal or both be NaN.
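For reference, here is a hedged sketch that puts the manual mask next to NumPy's built-in np.array_equal. It assumes NumPy 1.19 or later, where array_equal accepts the equal_nan keyword; manual_result and builtin_result are names chosen for this example only.

```python
import numpy as np

a = np.array([1.0, np.nan])
b = np.array([1.0, np.nan])

# Elementwise mask: values are equal, or both positions hold NaN.
mask = (a == b) | (np.isnan(a) & np.isnan(b))
manual_result = bool(mask.all())

# Built-in alternative (assumes NumPy >= 1.19 for the equal_nan keyword).
builtin_result = bool(np.array_equal(a, b, equal_nan=True))

print(manual_result)               # True
print(builtin_result)              # True
print(bool(np.array_equal(a, b)))  # False, since NaN != NaN by default
```

Unlike the manual mask, np.array_equal returns False for arrays of different shapes instead of broadcasting them, which may or may not be what you want.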

DSM answered Oct 31 '22 05:10
