numpy array indexing with lists and arrays

I have:

>>> a
array([[1, 2],
       [3, 4]])

>>> type(l), l # list of scalars
(<type 'list'>, [0, 1])

>>> type(i), i # a numpy array
(<type 'numpy.ndarray'>, array([0, 1]))

>>> type(j), j # list of numpy arrays
(<type 'list'>, [array([0, 1]), array([0, 1])])

When I do

>>> a[l] # Case 1, l is a list of scalars

I get

array([[1, 2],
       [3, 4]])

which means indexing happened only on 0th axis.

But when I do

>>> a[j] # Case 2, j is a list of numpy arrays

I get

array([1, 4])

which means indexing happened along axis 0 and axis 1.

Q1: When used for indexing, why are a list of scalars and a list of numpy arrays treated differently? (Case 1 vs Case 2) In Case 2, I was hoping to see indexing happen only along axis 0 and get

array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

Now, when using a numpy array of arrays instead:

>>> j1 = np.array(j) # numpy array of arrays

The result below indicates that indexing happened only along axis 0 (as expected)

>>> a[j1] # Case 3, j1 is a numpy array of numpy arrays
array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

Q2: When used for indexing, why are a list of numpy arrays and a numpy array of numpy arrays treated differently? (Case 2 vs Case 3)

Ankur Agarwal asked Dec 15 '17 06:12


1 Answer

Case 1: a[l] is actually a[(l,)], which expands to a[(l, slice(None))]. That is, the first dimension is indexed with the list l, and a trailing : slice is added automatically. Indices are passed to the array's __getitem__ as a tuple, so an extra pair of () can be added without changing the meaning.
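
The Case 1 equivalence can be checked directly. This is a minimal sketch that rebuilds a and l from the question; the names r_plain, r_tuple and r_slice are only for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
l = [0, 1]  # list of scalars, as in Case 1

# Three spellings of the same index: the list applies to axis 0 only,
# and the remaining axis is filled in with a full slice.
r_plain = a[l]
r_tuple = a[(l,)]
r_slice = a[l, :]  # same as a[(l, slice(None))]

print(np.array_equal(r_plain, r_tuple))  # True
print(np.array_equal(r_plain, r_slice))  # True
print(r_plain.shape)                     # (2, 2)
```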

Case 2: a[j] is treated as a[array([0, 1]), array([0, 1])] or a[(array([0, 1]), array([0, 1]))]. In other words, as a tuple of indexing objects, one per dimension. It ends up returning a[0,0] and a[1,1].

Case 3: a[j1] is a[(j1, slice(None))], which applies the j1 index to the first dimension only.
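
Both interpretations can be spelled out explicitly, which avoids depending on how a bare a[j] is read. As far as I know, that reading depends on the NumPy version: older releases treat the list as a tuple (Case 2), while NumPy 1.23 and later treat it like np.array(j) (Case 3). The sketch below rebuilds a and j from the question; paired and rows are names chosen just for this example.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
j = [np.array([0, 1]), np.array([0, 1])]  # list of arrays, as in Case 2

# Tuple of index arrays: one array per axis, paired element by element,
# so this picks a[0, 0] and a[1, 1].
paired = a[tuple(j)]

# Single 2-D index array: applied to axis 0 only, so each entry selects a row.
j1 = np.array(j)
rows = a[j1]

print(paired)      # [1 4]
print(rows.shape)  # (2, 2, 2)
```

Writing tuple(j) or np.array(j) at the call site states the intent, so the result does not change between NumPy versions.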

Case 2 is a bit of an anomaly. Your intuition is valid, but for historical reasons, this list of arrays (or list of lists) is interpreted as a tuple of arrays.

This has been discussed in other SO questions, and I think it is documented, but offhand I can't find those references.

So it's safer to use either a tuple of indexing objects, or an array. Indexing with a list has a potential ambiguity.


numpy array indexing: list index and np.array index give different result

This SO question touches on the same issue, though the clearest statement of what is happening is buried in a code link in a comment by @user2357112.

Another way to force Case 3-style indexing is to make the second-dimension slice explicit: a[j,:]

In [166]: a[j]
Out[166]: array([1, 4])
In [167]: a[j,:]
Out[167]: 
array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

(I often include the trailing : even if it isn't needed. It makes it clear to me, and readers, how many dimensions we are working with.)

hpaulj answered Sep 29 '22 12:09