
Python: equality for NaN in a list?

I just want to figure out the logic behind these results:

>>> nan = float('nan')
>>> nan == nan
False
# I understand that this is because the __eq__ method is defined this way
>>> nan in [nan]
True
# Is this because the __contains__ method for list is defined to compare identity first, then content?

But in both cases I think the function PyObject_RichCompareBool is called behind the scenes, right? Why is there a difference? Shouldn't they have the same behaviour?

asked Mar 17 '23 13:03 by Bob Fang



1 Answer

But in both cases I think the function PyObject_RichCompareBool is called behind the scenes, right? Why is there a difference? Shouldn't they have the same behaviour?

== never calls PyObject_RichCompareBool on the float objects directly. Floats have their own rich comparison function, float_richcompare (called for __eq__), which may or may not call PyObject_RichCompareBool depending on the arguments passed to it:

 /* Comparison is pretty much a nightmare.  When comparing float to float,
 * we do it as straightforwardly (and long-windedly) as conceivable, so
 * that, e.g., Python x == y delivers the same result as the platform
 * C x == y when x and/or y is a NaN.
 * When mixing float with an integer type, there's no good *uniform* approach.
 * Converting the double to an integer obviously doesn't work, since we
 * may lose info from fractional bits.  Converting the integer to a double
 * also has two failure modes:  (1) a long int may trigger overflow (too
 * large to fit in the dynamic range of a C double); (2) even a C long may have
 * more bits than fit in a C double (e.g., on a 64-bit box long may have
 * 63 bits of precision, but a C double probably has only 53), and then
 * we can falsely claim equality when low-order integer bits are lost by
 * coercion to double.  So this part is painful too.
 */

static PyObject*
float_richcompare(PyObject *v, PyObject *w, int op)
{
    double i, j;
    int r = 0;

    assert(PyFloat_Check(v));
    i = PyFloat_AS_DOUBLE(v);

    /* Switch on the type of w.  Set i and j to doubles to be compared,
     * and op to the richcomp to use.
     */
    if (PyFloat_Check(w))
        j = PyFloat_AS_DOUBLE(w);

    else if (!Py_IS_FINITE(i)) {
        if (PyInt_Check(w) || PyLong_Check(w))
            /* If i is an infinity, its magnitude exceeds any
             * finite integer, so it doesn't matter which int we
             * compare i with.  If i is a NaN, similarly.
             */
            j = 0.0;
        else
            goto Unimplemented;
    }
...
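
The non-finite branch in the excerpt can be observed from Python. The snippet below is only a small illustration, not part of CPython; the dictionary name `checks` is made up for this example. It compares NaN and infinity with each other and with an integer far too large to convert to a C double.

```python
nan = float('nan')
inf = float('inf')
huge = 10 ** 400  # too large to be represented as a C double

# Hypothetical name: `checks` just collects the results for display.
checks = {
    "nan == nan": nan == nan,    # float-to-float follows C semantics
    "nan != nan": nan != nan,
    "nan == huge": nan == huge,  # non-finite float vs. integer
    "inf > huge": inf > huge,
    "-inf < huge": -inf < huge,
}

for expression, value in checks.items():
    print(expression, "->", value)
```

On CPython this prints False for the two equality checks and True for the other three, without ever trying to convert `huge` to a double.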

On the other hand, list_contains calls PyObject_RichCompareBool directly on the items. That function treats identical objects as equal without consulting __eq__, hence you get True in the second case.
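
The effect of that call can be modelled in pure Python. The sketch below is an approximation of CPython's behaviour, not its actual implementation, and `contains_like_cpython` is a hypothetical helper name. It mirrors the equivalence the Python language reference gives for built-in containers, `any(x is e or x == e for e in y)`.

```python
def contains_like_cpython(items, target):
    """Approximate CPython's `target in items` for a list (hypothetical helper)."""
    for item in items:
        # Identity is checked first, so __eq__ is never consulted
        # when the very same object is stored in the list.
        if item is target or item == target:
            return True
    return False


nan = float('nan')
other_nan = float('nan')  # a distinct NaN object

print(contains_like_cpython([nan], nan))        # same object
print(contains_like_cpython([other_nan], nan))  # different object, nan != nan
```

The first call reports True because of the identity check alone; the second reports False because the two NaN objects are distinct and compare unequal.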


Note that this behaviour is specific to CPython. PyPy's list.__contains__ method seems to compare the items only by calling their __eq__ method:

$~/pypy-2.4.0-linux64/bin# ./pypy
Python 2.7.8 (f5dcc2477b97, Sep 18 2014, 11:33:30)
[PyPy 2.4.0 with GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> nan = float('nan')
>>>> nan == nan
False
>>>> nan is nan
True
>>>> nan in [nan]
False
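
Since the result of `nan in [nan]` differs between the two interpreters shown here, code that has to behave the same on both can test for NaN explicitly instead. A minimal sketch, assuming the list may hold arbitrary objects of which only floats can be NaN; `has_nan` is a hypothetical helper, not something taken from either interpreter:

```python
import math


def has_nan(items):
    """Return True if any float in `items` is NaN (hypothetical helper)."""
    # math.isnan inspects the value, so neither object identity
    # nor __eq__ plays any part in the result.
    return any(isinstance(x, float) and math.isnan(x) for x in items)


print(has_nan([1.0, float('nan')]))
print(has_nan([1.0, 'text', 2]))
```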
answered Mar 27 '23 02:03 by Ashwini Chaudhary
