
How does __contains__ work for ndarrays?

>>> x = numpy.array([[1, 2],
...                  [3, 4],
...                  [5, 6]])
>>> [1, 7] in x
True
>>> [1, 2] in x
True
>>> [1, 6] in x
True
>>> [2, 6] in x
True
>>> [3, 6] in x
True
>>> [2, 3] in x
False
>>> [2, 1] in x
False
>>> [1, 2, 3] in x
False
>>> [1, 3, 5] in x
False

I have no idea how __contains__ works for ndarrays. I couldn't find the relevant documentation when I looked for it. How does it work? And is it documented anywhere?

user2357112 supports Monica asked Aug 19 '13 18:08


2 Answers

I found the source for ndarray.__contains__, in numpy/core/src/multiarray/sequence.c. As a comment in the source states,

thing in x

is equivalent to

(x == thing).any()

for an ndarray x, regardless of the dimensions of x and thing. This only makes sense when thing is a scalar. When thing isn't a scalar, x == thing is broadcast elementwise, and taking any() of that result causes the weird results I observed, as well as oddities like array([1, 2, 3]) in array(1) that I didn't think to try. The exact source is

static int
array_contains(PyArrayObject *self, PyObject *el)
{
    /* equivalent to (self == el).any() */

    int ret;
    PyObject *res, *any;

    res = PyArray_EnsureAnyArray(PyObject_RichCompare((PyObject *)self,
                                                      el, Py_EQ));
    if (res == NULL) {
        return -1;
    }
    any = PyArray_Any((PyArrayObject *)res, NPY_MAXDIMS, NULL);
    Py_DECREF(res);
    ret = PyObject_IsTrue(any);
    Py_DECREF(any);
    return ret;
}
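
The equivalence stated in that source comment can be checked from Python. The following is only a minimal sketch: contains_like_ndarray is a hypothetical helper name that does not exist in NumPy, and the check is limited to inputs whose shapes broadcast against x (a scalar or a length-2 sequence).

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4],
              [5, 6]])

def contains_like_ndarray(arr, thing):
    """Hypothetical helper mirroring the comment in array_contains."""
    # Elementwise comparison (with broadcasting), then "is anything True?"
    return bool((arr == thing).any())

# Compare the `in` operator with the explicit expression for a few inputs
# taken from the question, plus two scalars.
for thing in (1, 7, [1, 7], [1, 2], [2, 6], [2, 3], [2, 1]):
    assert (thing in x) == contains_like_ndarray(x, thing)
```

If the loop finishes without an AssertionError, the two forms agreed for every input tried here. It says nothing about inputs whose shapes do not broadcast against x.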
user2357112 supports Monica answered Oct 18 '22 14:10


Seems like numpy's __contains__ is doing something like this for a 2-d case:

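def __contains__(self, item):
    # Rough Python sketch of the observed behaviour for a 2-d array.
    for row in self:
        # Sequences of a different length than a row never match.
        if len(item) != len(row):
            return False
        # One position-wise match in any row is enough.
        if any(item_value == row_value for item_value, row_value in zip(item, row)):
            return True
    return False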

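[1, 7] works because its 0th element matches the 0th element of the first row. The same goes for [1, 2] and the other True cases. With [2, 6], the 6 matches the 6 at index 1 of the last row. With [2, 3], no element matches a row element at the same index, so the result is False. [1, 2, 3] and [1, 3, 5] returned False in the session above because a length-3 sequence does not match the shape of a length-2 row.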
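
The position-wise matching described above can be made visible by printing the boolean mask that the comparison produces. This is only an illustrative sketch. has_row is a hypothetical helper, not part of NumPy, included on the assumption that a whole-row match is what the question was really after.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# [2, 6] is broadcast across the rows, so each column of x is compared
# with the entry of [2, 6] at the same position.
mask = (x == [2, 6])
print(mask)
# [[False False]
#  [False False]
#  [False  True]]

def has_row(arr, row):
    """Hypothetical helper: True only if some row of arr equals row entirely."""
    # all(axis=1) requires every position in a row to match;
    # any() then asks whether at least one such row exists.
    return bool((arr == row).all(axis=1).any())

print(has_row(x, [1, 2]))  # True
print(has_row(x, [2, 6]))  # False
```

The single True in the mask is why [2, 6] in x evaluates to True, even though no row of x equals [2, 6].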

See this for more, and also this ticket.

Alok Singhal answered Oct 18 '22 14:10