
Shorter version of this numpy array indexing

I have the following Python code (X is a NumPy array or a scipy.sparse matrix), and it works:

X[a,:][:,b]

But it doesn't look elegant. 'a' and 'b' are 1-D boolean masks.

The length of 'a' equals X.shape[0] and the length of 'b' equals X.shape[1].

I tried X[a,b] but it doesn't work.

What I am trying to accomplish is to select particular rows and columns at the same time. For example, select rows 0, 7, 8 and then, from that result, select columns 2, 3, 4.

How would you make this shorter and more elegant?

off99555 asked Jun 18 '16 14:06


1 Answer

You could use np.ix_ for this kind of broadcasted indexing:

X[np.ix_(a,b)]

This isn't any shorter than the original code, but it should be faster, because it avoids the intermediate output. The original code first creates X[a,:] with one indexing operation and then indexes that result again with [:,b] to produce the final output.
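
To make the mechanism a bit more concrete, here is a minimal sketch of what np.ix_ hands back for two boolean masks. The array X and the masks a and b below are made-up illustrative values, not the ones from the question:

```python
import numpy as np

# Illustrative data (assumed for this sketch): a 6x5 array and two masks.
X = np.arange(30).reshape(6, 5)
a = np.array([True, False, True, True, False, False])  # row mask, length X.shape[0]
b = np.array([False, True, False, True, True])         # column mask, length X.shape[1]

# np.ix_ turns each mask into an integer index array shaped for broadcasting:
# the row indices become a column vector, the column indices a row vector.
rows, cols = np.ix_(a, b)
print(rows.shape, cols.shape)  # (3, 1) (1, 3)

# Broadcasting the two index arrays selects the full rows x columns block
# in a single indexing operation.
one_step = X[rows, cols]

# The original two-step form builds X[a, :] first and then indexes it again.
two_step = X[a, :][:, b]

print(np.array_equal(one_step, two_step))  # True
```

X[rows, cols] here is the same thing as X[np.ix_(a, b)]; the tuple is only unpacked to show the shapes.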

This method also works when a and b are integer index arrays, not just boolean masks.
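
As a rough illustration of the integer case, the sketch below uses the rows 0, 7, 8 and columns 2, 3, 4 mentioned in the question. The 10x6 array is an assumed example. It also shows what plain X[row_idx, col_idx] does instead, which is likely why X[a,b] did not give the expected result:

```python
import numpy as np

# Assumed example array; only the row/column numbers come from the question.
X = np.arange(60).reshape(10, 6)
row_idx = np.array([0, 7, 8])
col_idx = np.array([2, 3, 4])

# np.ix_ selects every combination of the given rows and columns.
block = X[np.ix_(row_idx, col_idx)]
print(block.shape)  # (3, 3)

# Without np.ix_, the two index arrays are paired element by element,
# giving X[0, 2], X[7, 3], X[8, 4] rather than a 3x3 block.
paired = X[row_idx, col_idx]
print(paired.shape)  # (3,)

# Same block as the two-step indexing from the question.
print(np.array_equal(block, X[row_idx, :][:, col_idx]))  # True
```

With boolean masks the same pairing rule applies to the positions of the True entries, so X[a,b] raises an IndexError whenever the two masks select a different number of elements.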

Sample run

In [141]: X = np.random.randint(0,99,(6,5))

In [142]: m,n = X.shape

In [143]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))

In [144]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))

In [145]: X[a,:][:,b]
Out[145]: 
array([[17, 81, 64],
       [87, 16, 54],
       [98, 22, 11],
       [26, 54, 64]])

In [146]: X[np.ix_(a,b)]
Out[146]: 
array([[17, 81, 64],
       [87, 16, 54],
       [98, 22, 11],
       [26, 54, 64]])

Runtime test

In [147]: X = np.random.randint(0,99,(600,500))

In [148]: m,n = X.shape

In [149]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))

In [150]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))

In [151]: %timeit X[a,:][:,b]
1000 loops, best of 3: 1.74 ms per loop

In [152]: %timeit X[np.ix_(a,b)]
1000 loops, best of 3: 1.24 ms per loop

Divakar answered Oct 25 '22 22:10