I noticed some confusing behavior when indexing a flat numpy array with a list of tuples (using python 2.7.8 and numpy 1.9.1). My guess is that this is related to the maximum number of array dimensions (which I believe is 32), but I haven't been able to find the documentation.
>>> a = np.arange(100)
>>> tuple_index = [(i,) for i in a]
>>> a[tuple_index] # This works (but maybe it shouldn't)
>>> a[tuple_index[:32]] # This works too
>>> a[tuple_index[:31]] # This breaks, as does a[tuple_index[:i]] for any 2 <= i < 32
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: too many indices for array
>>> a[tuple_index[:1]] # This also works...
Is the list of tuples being "flattened" if it has 32 or more elements? Is this documented somewhere?
The difference appears to be that the first examples trigger fancy indexing (which simply selects indices in a list from the same dimension) whereas tuple_index[:31] is instead treated as an indexing tuple (which implies selection from multiple axes).
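As a minimal sketch of that distinction (the 2-D array b is a made-up example, not from the question), a list of integers picks several entries along the first axis, while a tuple supplies one index per axis:

```python
import numpy as np

# Hypothetical 3x4 array, used only to illustrate the two kinds of index.
b = np.arange(12).reshape(3, 4)

# Fancy indexing: a list of integers selects rows 0 and 2 from the first axis.
rows = b[[0, 2]]
print(rows.shape)  # (2, 4)

# Indexing tuple: one entry per axis, so this is the same as b[0, 2].
elem = b[(0, 2)]
print(elem)  # 2
```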
As you noted, the maximum number of dimensions for a NumPy array is (usually) 32:
>>> np.MAXDIMS
32
According to the following comment in the mapping.c file (which contains the code to interpret the index passed by the user), any sequence of tuples shorter than 32 is flattened to an indexing tuple:
/*
* Sequences < NPY_MAXDIMS with any slice objects
* or newaxis, Ellipsis or other arrays or sequences
* embedded, are considered equivalent to an indexing
* tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
*/
(I haven't yet found a reference for this in the official documentation on the SciPy site.)
This makes a[tuple_index[:3]] equivalent to a[(0,), (1,), (2,)], hence the "too many indices" error (because a has only one dimension but we're implying there are three).
On the other hand, a[tuple_index] is just the same as a[[(0,), (1,), (2,), ..., (99,)]], which is fancy indexing with a 100x1 index and so results in a 2D array of shape (100, 1).
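If the goal is to avoid depending on the length of the list, one option is to state the intent explicitly. The sketch below reuses a and tuple_index from the question; the names as_array and error are only illustrative. Converting the list to an array requests fancy indexing, and converting it to a tuple requests one index per axis:

```python
import numpy as np

a = np.arange(100)
tuple_index = [(i,) for i in a]

# Explicit array index: treated as fancy indexing regardless of its length.
as_array = a[np.array(tuple_index[:3])]
print(as_array.shape)  # (3, 1)

# Explicit indexing tuple: three entries mean three axes, which a 1-D array lacks.
try:
    a[tuple(tuple_index[:3])]
except IndexError as exc:
    error = exc
    print("IndexError:", exc)
```

Written this way, neither call relies on the sequence-length heuristic described in the mapping.c comment.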