I have:
>>> a
array([[1, 2],
[3, 4]])
>>> type(l), l # list of scalers
(<type 'list'>, [0, 1])
>>> type(i), i # a numpy array
(<type 'numpy.ndarray'>, array([0, 1]))
>>> type(j), j # list of numpy arrays
(<type 'list'>, [array([0, 1]), array([0, 1])])
When I do
>>> a[l] # Case 1, l is a list of scalers
I get
array([[1, 2],
[3, 4]])
which means indexing happened only on 0th axis.
But when I do
>>> a[j] # Case 2, j is a list of numpy arrays
I get
array([1, 4])
which means indexing happened along axis 0 and axis 1.
Q1: When used for indexing, why is there a difference in treatment of list of scalers and list of numpy arrays ? (Case 1 vs Case 2). In Case 2, I was hoping to see indexing happen only along axis 0 and get
array( [[[1,2],
[3,4]],
[[1,2],
[3,4]]])
Now, when using numpy array of arrays instead
>>> j1 = np.array(j) # numpy array of arrays
The result below indicates that indexing happened only along axis 0 (as expected)
>>> a[j1] Case 3, j1 is a numpy array of numpy arrays
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
Q2: When used for indexing, why is there a difference in treatment of list of numpy arrays and numpy array of numpy arrays? (Case 2 vs Case 3)
Case1, a[l]
is actually a[(l,)]
which expands to a[(l, slice(None))]
. That is, indexing the first dimension with the list l
, and an automatic trailing :
slice. Indices are passed as a tuple to the array __getitem__
, and extra ()
may be added without confusion.
Case2, a[j]
is treated as a[array([0, 1]), array([0, 1]]
or a[(array(([0, 1]), array([0, 1])]
. In other words, as a tuple of indexing objects, one per dimension. It ends up returning a[0,0]
and a[1,1]
.
Case3, a[j1]
is a[(j1, slice(None))]
, applying the j1
index to just the first dimension.
Case2 is a bit of any anomaly. Your intuition is valid, but for historical reasons, this list of arrays (or list of lists) is interpreted as a tuple of arrays.
This has been discussed in other SO questions, and I think it is documented. But off hand I can't find those references.
So it's safer to use either a tuple of indexing objects, or an array. Indexing with a list has a potential ambiguity.
numpy array indexing: list index and np.array index give different result
This SO question touches on the same issue, though the clearest statement of what is happening is buried in a code link in a comment by @user2357112.
Another way of forcing the Case3 like indexing, make the 2nd dimension slice explicit, a[j,:]
In [166]: a[j]
Out[166]: array([1, 4])
In [167]: a[j,:]
Out[167]:
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
(I often include the trailing :
even if it isn't needed. It makes it clear to me, and readers, how many dimensions we are working with.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With