Understanding weird boolean 2d-array indexing behavior in numpy

Question

Why does this work:

import numpy as np

a=np.random.rand(10,20)
x_range=np.arange(10)
y_range=np.arange(20)

a_tmp=a[x_range<5,:]
b=a_tmp[:,np.in1d(y_range,[3,4,8])]

and this does not:

a=np.random.rand(10,20)
x_range=np.arange(10)
y_range=np.arange(20)    

b=a[x_range<5,np.in1d(y_range,[3,4,8])]

pv. · Accepted Answer

The NumPy reference documentation's page on indexing contains the answer, but it requires a bit of careful reading.

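The answer here is that indexing with boolean arrays is equivalent to indexing with the integer arrays obtained by first passing each boolean array through np.nonzero. Therefore, with 1-D boolean arrays m1 and m2,

a[m1, m2] == a[m1.nonzero()[0], m2.nonzero()[0]]

(np.nonzero returns a tuple of arrays, hence the [0]). When this succeeds (the two index arrays must have the same length, or be broadcastable to a common shape), the indices are paired up element by element; for equal lengths it is equivalent to:

[a[i, j] for i, j in zip(m1.nonzero()[0], m2.nonzero()[0])]

In the question, the row mask has 5 True values and the column mask has 3, so the two index arrays cannot be paired up and the lookup fails with a shape-mismatch IndexError.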
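The pairing behaviour can be checked on a small array. The following is only a sketch; the array a, the masks m1, m2 and m2_three, and the other variable names are made up for illustration and are not from the question.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)             # small array so the values are easy to follow
m1 = np.array([True, False, True])          # True for rows 0 and 2
m2 = np.array([False, True, False, True])   # True for columns 1 and 3

# Each boolean mask is first turned into an integer index array.
rows = m1.nonzero()[0]                      # row indices 0 and 2
cols = m2.nonzero()[0]                      # column indices 1 and 3

# The index arrays are then paired element by element: (0, 1) and (2, 3).
paired = a[m1, m2]
explicit = np.array([a[i, j] for i, j in zip(rows, cols)])

# With a different number of True values the pairing cannot be done.
m2_three = np.array([True, True, False, True])  # three True values vs. two in m1
try:
    a[m1, m2_three]
    mismatch_raises = False
except IndexError:
    mismatch_raises = True
```

Here paired and explicit both hold the two elements a[0, 1] and a[2, 3], not the 2x2 block of rows {0, 2} and columns {1, 3}, and the mismatched masks raise an IndexError just like the example in the question.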

I'm not sure why it was designed to work like this; usually, this is not what you'd want.

To get the more intuitive result, you can instead do

a[np.ix_(m1, m2)]

which produces a result equivalent to

[[a[i,j] for j in range(a.shape[1]) if m2[j]] for i in range(a.shape[0]) if m1[i]]
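As a quick check, np.ix_ can be compared against the two-step indexing from the question and against the nested list comprehension. This is a sketch using the question's shapes; np.isin is used here as the newer equivalent of np.in1d for 1-D input, and the names outer, two_step and nested are illustrative.

```python
import numpy as np

a = np.random.rand(10, 20)
m1 = np.arange(10) < 5                     # row mask from the question
m2 = np.isin(np.arange(20), [3, 4, 8])     # column mask (np.isin instead of np.in1d)

# np.ix_ builds an open mesh, so every selected row is crossed
# with every selected column.
outer = a[np.ix_(m1, m2)]

# The two-step version from the question.
two_step = a[m1, :][:, m2]

# The nested list comprehension written out explicitly.
nested = np.array([[a[i, j] for j in range(a.shape[1]) if m2[j]]
                   for i in range(a.shape[0]) if m1[i]])

print(outer.shape)
print(np.array_equal(outer, two_step), np.array_equal(outer, nested))
```

All three give the same 5x3 block, so a[np.ix_(m1, m2)] is a one-step replacement for the working code in the question.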

Tags:

python

numpy

tillsten