Element-wise comparison with NaNs as equal

Tags: python, pandas, dataframe, nan, numpy

If I run the following code:

import numpy as np
import pandas as pd

dft1 = pd.DataFrame({'a': [1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a': [1, 1, np.nan]})
dft1.a == dft2.a

The result is

0     True
1    False
2    False
Name: a, dtype: bool

How can I make the result be

0     True
1    False
2     True
Name: a, dtype: bool

That is, I want np.nan == np.nan to evaluate to True.

I thought this was basic functionality and that I must be asking a duplicate question, but I spent a lot of time searching on SO and Google and couldn't find it.

767

asked Aug 30 '18 18:08

GoCurry

1 Answer

I can't think of a built-in function that does this for you (weird), so you can just do it yourself:

dft1.eq(dft2) | (dft1.isna() & dft2.isna())

       a
0   True
1  False
2   True

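Note the parentheses. Python's | and & bind more tightly than comparison operators such as ==, so when you mix overloaded bitwise operators with comparisons in pandas, wrap each sub-expression in parentheses to make the grouping explicit.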
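If you need this in more than one place, the expression above can be wrapped in a small helper. This is only a sketch: the name eq_nan_equal is made up for illustration and is not a pandas function. It relies on eq and isna, which exist on both Series and DataFrame, so it should work for either as long as both operands are the same kind of object.

```python
import numpy as np
import pandas as pd


def eq_nan_equal(left, right):
    """Element-wise equality that also treats NaN == NaN as True.

    Hypothetical helper (not part of pandas); works for two Series
    or two DataFrames with matching labels.
    """
    # True where the values are equal, or where both sides are missing.
    return left.eq(right) | (left.isna() & right.isna())


dft1 = pd.DataFrame({'a': [1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a': [1, 1, np.nan]})

frame_result = eq_nan_equal(dft1, dft2)             # DataFrame of bools
series_result = eq_nan_equal(dft1['a'], dft2['a'])  # Series of bools
print(series_result)
```

For the example frames, the Series result matches the output the question asks for.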

Another option is np.nan_to_num. It compares the underlying arrays by position rather than by label, so the result is only valid if you are certain both DataFrames have identical index and columns:

np.nan_to_num(dft1) == np.nan_to_num(dft2)

array([[ True],
       [False],
       [ True]])

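np.nan_to_num replaces NaN with a filler value (0.0 by default for float data; non-float arrays such as strings are returned unchanged). Be aware that after filling, a NaN on one side also compares equal to a genuine 0 on the other.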
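One thing worth checking before relying on the fill approach is whether 0 is a legitimate value in your data. The sketch below uses made-up frames (left and right are illustrative, not from the question) to compare the two approaches when a NaN lines up with a real 0.

```python
import numpy as np
import pandas as pd

# Row 0 pairs a real 0 with NaN; row 2 pairs NaN with NaN.
left = pd.DataFrame({'a': [0.0, np.nan, np.nan]})
right = pd.DataFrame({'a': [np.nan, 1.0, np.nan]})

# Fill approach: NaN becomes 0.0, so row 0 is reported as equal.
via_fill = np.nan_to_num(left) == np.nan_to_num(right)

# Mask approach: only positions where both sides are NaN count as equal.
via_mask = (left.eq(right) | (left.isna() & right.isna())).to_numpy()

print(via_fill.ravel())
print(via_mask.ravel())
```

The two results differ in row 0, so the mask-based expression is the safer choice whenever 0 can occur in the data.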

125

answered Oct 25 '22 20:10

cs95
