Unexpected behaviour when indexing a 2D np.array with two boolean arrays

import numpy as np

two_d = np.array([[ 0,  1,  2,  3,  4],
                  [ 5,  6,  7,  8,  9],
                  [10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19],
                  [20, 21, 22, 23, 24]])

first = np.array((True, True, False, False, False))
second = np.array((False, False, False, True, True))

Now, when I enter:

two_d[first, second]

I get:

array([3, 9])

which doesn't make a whole lot of sense to me. Can anybody explain that simply?

Ted asked Jan 19 '16 12:01


2 Answers

When given multiple boolean arrays to index with, NumPy pairs up the indices of the True values. The first True value in first is paired with the first True value in second, and so on. NumPy then fetches the element at each of these (row, column) index pairs.

This means that two_d[first, second] is equivalent to:

two_d[[0, 1], [3, 4]]

In other words, you're retrieving the values at indices (0, 3) and (1, 4), which are 3 and 9. Note that if the two arrays had different numbers of True values that cannot be broadcast together (say, two and three), an IndexError would be raised!
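
The pairing can be checked directly. The sketch below rebuilds the question's arrays (np.arange(25).reshape(5, 5) produces the same values as two_d) and compares the boolean form with the explicit integer form. The mask named third, with three True values, is a hypothetical addition used only to trigger the mismatch error.

```python
import numpy as np

two_d = np.arange(25).reshape(5, 5)   # same values as the question's array
first = np.array([True, True, False, False, False])
second = np.array([False, False, False, True, True])

# Each boolean mask is turned into the integer positions of its True values.
rows = np.flatnonzero(first)    # positions 0 and 1
cols = np.flatnonzero(second)   # positions 3 and 4

paired = two_d[first, second]   # boolean form from the question
explicit = two_d[rows, cols]    # explicit integer form
print(paired, explicit)

# Hypothetical mask with three True values: two row indices cannot be
# paired with three column indices, so NumPy refuses.
third = np.array([True, False, False, True, True])
mismatch_error = None
try:
    two_d[first, third]
except IndexError as exc:
    mismatch_error = exc
    print("IndexError:", exc)
```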

The documentation on advanced indexing mentions this behaviour briefly and suggests np.ix_ as a 'less surprising' alternative:

Combining multiple Boolean indexing arrays or a Boolean with an integer indexing array can best be understood with the obj.nonzero() analogy. The function ix_ also supports boolean arrays and will work without any surprises.

Hence you may be looking for:

>>> two_d[np.ix_(first, second)]
array([[3, 4],
       [8, 9]])
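
For intuition about why np.ix_ behaves differently, here is a small sketch reusing the question's arrays. It shows that np.ix_ turns the masks into a column of row indices and a row of column indices, which broadcast against each other into a full grid instead of being paired element by element. The chained form two_d[first][:, second] is included only as a comparison for reading values; the names row_idx, col_idx, block and chained are illustrative.

```python
import numpy as np

two_d = np.arange(25).reshape(5, 5)   # same values as the question's array
first = np.array([True, True, False, False, False])
second = np.array([False, False, False, True, True])

# np.ix_ converts each mask to integer positions and reshapes them so
# that they broadcast into a 2 x 2 grid of (row, column) combinations.
row_idx, col_idx = np.ix_(first, second)
print(row_idx.shape, col_idx.shape)

block = two_d[row_idx, col_idx]       # rows 0-1 crossed with columns 3-4

# Selecting rows first and then columns reads the same values.
chained = two_d[first][:, second]
print(block)
```

One difference worth keeping in mind: the chained form indexes a temporary copy, so assigning to two_d[first][:, second] does not modify two_d, whereas assigning to two_d[np.ix_(first, second)] does.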

Alex Riley answered Sep 28 '22 05:09


Check the documentation on boolean indexing.

two_d[first, second] is the same as
two_d[first.nonzero()[0], second.nonzero()[0]] (nonzero() returns a tuple holding one index array per dimension, hence the [0]), where:

>>> first.nonzero()
(array([0, 1]),)
>>> second.nonzero()
(array([3, 4]),)

Used as indices, this will select 3 and 9 because

>>> two_d[0,3]
3
>>> two_d[1,4]
9

and

>>> two_d[[0,1],[3,4]]
array([3, 9])

Also mildly related: NumPy indexing using List?

timgeb answered Sep 28 '22 03:09