appending numpy array with booleans

Tags:

python

numpy

Can someone explain what this code is doing?

   import numpy as np
   a = np.array([[1, 2], [3, 4]])
   a[..., [True, False]]

What is the [True, False] doing there?

JRR asked Nov 09 '22 13:11


1 Answer

Ellipsis Notation and Booleans as Integers

From the numpy docs:

Ellipsis expand to the number of : objects needed to make a selection tuple of the same length as x.ndim. There may only be a single ellipsis present
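
As a small illustration of that rule, here is a sketch using the same array as the docs example. The variable names are only for illustration; the point is that on a 3-D array, `...` followed by one index stands in for two full slices.

```python
import numpy as np

x = np.array([[[1], [2], [3]], [[4], [5], [6]]])  # shape (2, 3, 1)

# With a 3-D array, `...` followed by one index stands for two full slices.
with_ellipsis = x[..., 0]
with_slices = x[:, :, 0]

print(with_ellipsis.shape)                         # (2, 3)
print(np.array_equal(with_ellipsis, with_slices))  # True

# A leading index works the same way: x[0, ...] is x[0, :, :].
print(np.array_equal(x[0, ...], x[0, :, :]))       # True
```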

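In older NumPy releases, a boolean in an index is just an obfuscated integer: False is 0 and True is 1. (Current releases treat booleans as a mask instead; see the note on Boolean array-likes further down.) Taking the example from the docs:

import numpy as np

x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
x[...,0]
# outputs: array([[1, 2, 3],
#                 [4, 5, 6]])
x[..., False] # same thing where False is read as the integer 0

There, the boolean values specify an index, just like the numbers 0 or 1 would.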


In response to your question in the comments

At first it seems magical that, on an older NumPy release that reads booleans as integers,

a = np.array([[1, 2], [3, 4]])
a[..., [True, True]]  # = [[2,2],[4,4]]

But when we consider it as

a[..., [1,1]] # = [[2,2],[4,4]]

It seems less impressive.

Similarly:

b = np.array([[1,2,3],[4,5,6]])
b[...,[2,2]] # = [[3,3],[6,6]]

After the ellipsis rules are applied, the True and False values grab column indices, just like 0, 1, or 17 would have.
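
The integer case can be checked directly. A minimal sketch (the names `repeated` and `reordered` are just for illustration) showing that each entry of the list picks one column, so entries can repeat or appear in any order:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

# Each entry of the list picks one column; repeats are allowed.
repeated = b[..., [2, 2]]
reordered = b[..., [2, 0]]

print(repeated.tolist())   # [[3, 3], [6, 6]]
print(reordered.tolist())  # [[3, 1], [6, 4]]
```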


Boolean Arrays for Complex Indexing

There are some subtle differences (bools have a different type than ints). A lot of the hairy details can be found here. These do not seem to play any role in your code, but they are interesting for figuring out how NumPy indexing works.

In particular, this line is probably what you're looking for:

In the future Boolean array-likes (such as lists of python bools) will always be treated as Boolean indexes
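
On recent NumPy releases, where that change has taken effect, the list from the question acts as a mask over the last axis rather than as the indices 1 and 0. A minimal sketch, assuming such a release (the variable names are illustrative):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Boolean list: keep the columns where the mask is True.
masked = a[..., [True, False]]
# Integer list: take column 1, then column 0.
by_index = a[..., [1, 0]]

print(masked.tolist())    # [[1], [3]]
print(by_index.tolist())  # [[2, 1], [4, 3]]

# The mask is equivalent to indexing with the positions of its True entries.
print(np.array_equal(masked, a[..., np.flatnonzero([True, False])]))  # True
```

So if you actually want integer indices, write them as integers; a list of Python bools will be read as a mask.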

On this page, they talk about boolean arrays, which are quite complex as an indexing tool:

Boolean arrays used as indices are treated in a different manner entirely than index arrays. Boolean arrays must be of the same shape as the initial dimensions of the array being indexed
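
To make the shape requirement concrete, here is a sketch assuming a recent NumPy release, where a mismatched mask raises `IndexError`. The arrays `row_mask` and `bad_mask` are made up for the example.

```python
import numpy as np

y = np.arange(12).reshape(3, 4)

# A mask of length 3 matches the first dimension, so it selects whole rows.
row_mask = np.array([True, False, True])
rows = y[row_mask]
print(rows.tolist())  # [[0, 1, 2, 3], [8, 9, 10, 11]]

# A mask whose length does not match the dimension it indexes is rejected.
bad_mask = np.array([True, False])
try:
    y[bad_mask]
    mismatch_rejected = False
except IndexError:
    mismatch_rejected = True
print(mismatch_rejected)  # True
```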

Skipping down a bit:

Unlike in the case of integer index arrays, in the boolean case, the result is a 1-D array containing all the elements in the indexed array corresponding to all the true elements in the boolean array. The elements in the indexed array are always iterated and returned in row-major (C-style) order. The result is also identical to y[np.nonzero(b)]. As with index arrays, what is returned is a copy of the data, not a view as one gets with slices.
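
Those three properties (1-D result in row-major order, equivalence with `np.nonzero`, and copy semantics) can be checked with a short sketch. The array `y` and mask `b` below are arbitrary examples, not taken from the docs.

```python
import numpy as np

y = np.arange(12).reshape(3, 4)
b = y % 5 == 0  # boolean array with the same shape as y

# The result is 1-D, with elements taken in row-major order.
selected = y[b]
print(selected.tolist())                           # [0, 5, 10]
print(np.array_equal(selected, y[np.nonzero(b)]))  # True

# The result is a copy: changing it leaves y untouched.
selected[0] = -1
print(y[0, 0])  # 0
```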

en_Knight answered Nov 14 '22 21:11