If I run the following code:
import numpy as np
import pandas as pd
dft1 = pd.DataFrame({'a': [1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a': [1, 1, np.nan]})
dft1.a == dft2.a
The result is
0 True
1 False
2 False
Name: a, dtype: bool
How can I make the result be
0 True
1 False
2 True
Name: a, dtype: bool
That is, I want np.nan == np.nan to evaluate to True.
I thought this was basic functionality and that I must be asking a duplicate question, but I spent a lot of time searching on SO and Google and couldn't find it.
To check for NaN values in a NumPy array you can use the np.isnan() function. It returns a boolean mask with the same shape as the original array: True at the positions that hold NaN and False everywhere else.
NaN is not equal to NaN. Short story: under the IEEE 754 specification, every comparison involving NaN with ==, <, >, <= or >= evaluates to false, and != evaluates to true, so NaN == NaN is false.
In NumPy, NaN stands for "not a number". It is a floating-point value used to mark missing numerical entries in an array, and it is available as np.nan.
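As a small illustration of the mask described above, here is a minimal sketch; the array contents and the variable names (arr, mask, n_missing) are made up for the example.

```python
import numpy as np

# Hypothetical sample data with two missing entries.
arr = np.array([1.0, np.nan, 3.0, np.nan])

# np.isnan returns a boolean array of the same shape as its input.
mask = np.isnan(arr)

# Summing a boolean mask counts the True entries, i.e. the NaNs.
n_missing = int(mask.sum())

print(mask)
print(n_missing)
```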
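The comparison behaviour can be checked directly in Python. This is only a sketch; the variable names are illustrative.

```python
import math
import numpy as np

nan = float("nan")

eq = nan == nan        # equality with NaN is always False
ne = nan != nan        # inequality with NaN is always True
lt = nan < 1.0         # ordering comparisons with NaN are False
np_eq = np.nan == np.nan  # np.nan is an ordinary float NaN

# To test for NaN, use a dedicated check instead of ==.
is_nan = math.isnan(nan)

print(eq, ne, lt, np_eq, is_nan)
```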
Can't think of a function that already does this for you (weird) so you can just do it yourself:
dft1.eq(dft2) | (dft1.isna() & dft2.isna())
a
0 True
1 False
2 True
Note the presence of the parentheses. Precedence is a thing to watch out for when working with overloaded bitwise operators in pandas.
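If you need this in more than one place, the expression can be wrapped in a small helper. This is a sketch only: nan_equal is a hypothetical name, not a pandas function, and it assumes both inputs are already aligned (same index and columns).

```python
import numpy as np
import pandas as pd

def nan_equal(left, right):
    """Hypothetical helper: element-wise equality where NaN matches NaN."""
    # Parentheses matter: & and | bind more tightly than ==, so each
    # comparison is computed first and only then combined.
    return left.eq(right) | (left.isna() & right.isna())

dft1 = pd.DataFrame({'a': [1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a': [1, 1, np.nan]})

result = nan_equal(dft1.a, dft2.a)
print(result)
```

The same helper works on whole DataFrames, since eq and isna exist on both Series and DataFrame. Had the == operator been used instead of .eq, an unparenthesized dft1 == dft2 | mask would be parsed as dft1 == (dft2 | mask), which is why the parentheses are worth keeping.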
Another option is to use np.nan_to_num, if you are certain the index and columns of both DataFrames are identical so that this result is valid:
np.nan_to_num(dft1) == np.nan_to_num(dft2)
array([[ True],
       [False],
       [ True]])
np.nan_to_num fills NaNs with some filler value (0 for numeric, 'nan' for string arrays).
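One thing to watch with the filler approach: because NaN becomes 0 for numeric data, a genuine 0 in one frame compares equal to a NaN in the other. The sketch below uses made-up frames x and y to show the difference from the isna-based mask.

```python
import numpy as np
import pandas as pd

# Hypothetical data: x holds a real 0 where y holds NaN.
x = pd.DataFrame({'a': [0.0, np.nan]})
y = pd.DataFrame({'a': [np.nan, np.nan]})

# Filler approach: NaN is replaced by 0, so row 0 looks equal.
via_fill = np.nan_to_num(x.to_numpy()) == np.nan_to_num(y.to_numpy())

# Mask approach: only NaN/NaN pairs are treated as equal.
via_mask = (x.eq(y) | (x.isna() & y.isna())).to_numpy()

print(via_fill)
print(via_mask)
```

If zeros can occur in your data, the mask-based expression is the safer of the two.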