In numpy, what does indexing an array with the empty tuple vs. ellipsis do?

I just discovered — by chance — that an array in numpy may be indexed by an empty tuple:

In [62]: a = arange(5)

In [63]: a[()]
Out[63]: array([0, 1, 2, 3, 4])

I found some documentation on the numpy wiki ZeroRankArray:

(Sasha) First, whatever choice is made for x[...] and x[()] they should be the same because ... is just syntactic sugar for "as many : as necessary", which in the case of zero rank leads to ... = (:,)*0 = (). Second, rank zero arrays and numpy scalar types are interchangeable within numpy, but numpy scalars can be used in some python constructs where ndarrays can't.

So, for 0-d arrays a[()] and a[...] are supposed to be equivalent. Are they for higher-dimensional arrays, too? They strongly appear to be:

In [65]: a = arange(25).reshape(5, 5)

In [66]: a[()] is a[...]
Out[66]: False

In [67]: (a[()] == a[...]).all()
Out[67]: True

In [68]: a = arange(3**7).reshape((3,)*7)

In [69]: (a[()] == a[...]).all()
Out[69]: True
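
The same comparison can be sketched as a standalone script with an explicit `np.` prefix. This is only an illustration: `np.array_equal` and `np.shares_memory` stand in for the `(... == ...).all()` check above, and the variable names are made up for the example.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

via_tuple = a[()]       # index with the empty tuple
via_ellipsis = a[...]   # index with Ellipsis

# Both select every element, so shape and contents match.
same_shape = via_tuple.shape == via_ellipsis.shape == a.shape
same_values = np.array_equal(via_tuple, via_ellipsis)

# Neither form copies data: both results refer to the buffer behind `a`.
shares_buffer = np.shares_memory(via_tuple, a) and np.shares_memory(via_ellipsis, a)

print(same_shape, same_values, shares_buffer)
```

So value-wise the two forms agree; any difference has to be in which object comes back, not in what it contains.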

But the two forms are not just syntactic sugar for each other, neither for a high-dimensional array nor for a 0-d array:

In [76]: a[()] is a
Out[76]: False

In [77]: a[...] is a
Out[77]: True

In [79]: b = array(0)

In [80]: b[()] is b
Out[80]: False

In [81]: b[...] is b
Out[81]: True
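
For the 0-d case, looking at the type of each result may be more informative than an identity check. A minimal sketch, assuming a reasonably recent NumPy; the names `scalar` and `zero_d` are illustrative, and the exact scalar type printed depends on the platform's default integer:

```python
import numpy as np

b = np.array(0)

scalar = b[()]     # empty tuple: extracts the single element as a NumPy scalar
zero_d = b[...]    # Ellipsis: the result is still an array of rank zero

print(type(scalar).__name__, isinstance(scalar, np.generic))
print(type(zero_d).__name__, zero_d.ndim, zero_d.shape)
```

This would explain why `b[()] is b` is False above: the empty tuple does not hand back an array at all.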

And then there is the case of indexing by an empty list, which does something else altogether, but appears equivalent to indexing with an empty ndarray:

In [78]: a[[]]
Out[78]: array([], shape=(0, 3, 3, 3, 3, 3, 3), dtype=int64)

In [86]: a[arange(0)]
Out[86]: array([], shape=(0, 3, 3, 3, 3, 3, 3), dtype=int64)

In [82]: b[[]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

IndexError: 0-d arrays can't be indexed.
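
The empty-list behaviour above can be reproduced as a script. This is a sketch rather than a statement about every NumPy version: the wording of the error message in particular differs between releases, so only the exception type is checked, and the `raised` flag is just a helper for the example.

```python
import numpy as np

a = np.arange(3**7).reshape((3,) * 7)

# An empty list acts as an (empty) integer index array: it selects zero
# entries along the first axis and leaves the remaining axes untouched.
empty_list = a[[]]
empty_array = a[np.arange(0)]
print(empty_list.shape, empty_array.shape)

# A 0-d array has no axis for the index array to select from.
b = np.array(0)
raised = False
try:
    b[[]]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```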

So, it appears that () and ... are similar but not quite identical, and indexing with [] means something else altogether. And a[] or b[] are SyntaxErrors. Indexing with lists is documented under "index arrays", and there is a short notice about indexing with tuples at the end of the same document.

That leaves the question:

Is the difference between a[()] and a[...] by design? What is the design, then?

(Question somehow reminiscent of: What does the empty `()` do on a Matlab matrix?)

Edit:

In fact, even scalars may be indexed by an empty tuple:

In [36]: numpy.int64(10)[()]
Out[36]: 10
asked Feb 04 '13 14:02 by gerrit


1 Answer

The treatment of A[...] is a special case, optimised to always return A itself:

if (op == Py_Ellipsis) {
    Py_INCREF(self);
    return (PyObject *)self;
}

Anything else that should be equivalent, e.g. A[:], A[(Ellipsis,)], A[()], A[(slice(None),) * A.ndim], will instead return a view of the entirety of A, whose base is A:

>>> A[()] is A
False
>>> A[()].base is A
True
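
A small sketch that runs the same check over each of the equivalent forms listed above. It assumes `A` owns its data (hence `np.zeros` rather than a reshaped array, whose `base` would be the original buffer's owner), and the dictionary of labels is only for printing. Plain `A[...]` is left out, since whether it returns `A` itself or a new view may depend on the NumPy version.

```python
import numpy as np

A = np.zeros((2, 3))   # freshly allocated, so A owns its data

full_views = {
    "A[:]": A[:],
    "A[(Ellipsis,)]": A[(Ellipsis,)],
    "A[()]": A[()],
    "A[(slice(None),) * A.ndim]": A[(slice(None),) * A.ndim],
}

for label, view in full_views.items():
    # Each result is a distinct ndarray object backed by A's buffer.
    print(label, view is A, view.base is A, view.shape == A.shape)

# Writing through one of the views is visible in A.
A[()][0, 0] = 1.0
print(A[0, 0])
```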

This seems an unnecessary and premature optimisation, since A[(Ellipsis,)] and A[()] will always give the same result (a view of the whole of A). Judging from https://github.com/numpy/numpy/commit/fa547b80f7035da85f66f9cbabc4ff75969d23cd, the special case was originally required because indexing with ... didn't work properly on 0-d arrays (prior to https://github.com/numpy/numpy/commit/4156b241aa3670f923428d4e72577a9962cdf042 it would return the element as a scalar), and it was then extended to all arrays for consistency. Indexing on 0-d arrays has since been fixed, so the optimisation is no longer required, but it has managed to stick around vestigially (and there's probably some code that depends on A[...] is A being true).

answered Oct 25 '22 11:10 by ecatmur