Selecting specific rows and columns from NumPy array

[Edit] The built-in method: `np.ix_`

I recently discovered that NumPy has a built-in one-liner that does exactly what @Jaime suggested, without the explicit broadcasting syntax, which is harder to read. From the docs:

Using ix_ one can quickly construct index arrays that will index the cross product. a[np.ix_([1,3],[2,5])] returns the array [[a[1,2] a[1,5]], [a[3,2] a[3,5]]].

So you use it like this:

>>> a = np.arange(20).reshape((5,4))
>>> a[np.ix_([0,1,3], [0,2])]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])

It works by reshaping the index arrays the way Jaime suggested (a column of row indices and a row of column indices), so that they broadcast against each other into the full grid:

>>> np.ix_([0,1,3], [0,2])
(array([[0],
        [1],
        [3]]), array([[0, 2]]))
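To make the alignment concrete, here is a small sketch (the names `full_rows` and `full_cols` are only illustrative) that expands the two index arrays with `np.broadcast_arrays` to show the grid of index pairs NumPy ends up using:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))
rows, cols = np.ix_([0, 1, 3], [0, 2])

# rows is a column vector of shape (3, 1); cols is a row vector of shape (1, 2).
print(rows.shape, cols.shape)

# Broadcasting the two against each other yields one (row, col) pair per cell
# of the 3 x 2 result.
full_rows, full_cols = np.broadcast_arrays(rows, cols)
print(full_rows)
print(full_cols)

# Indexing with the pair of arrays is the same as a[np.ix_([0, 1, 3], [0, 2])].
result = a[rows, cols]
print(result)
```

Each element `result[i, j]` is `a[full_rows[i, j], full_cols[i, j]]`, which is why the output has the shape of the broadcast index arrays.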

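Also, as MikeC says in a comment, indexing with np.ix_ in a single step lets you assign to the selected elements, which my first (pre-edit) answer did not allow. Note that reading a[np.ix_(...)] still returns a copy, not a view, because advanced indexing always copies; it is the indexed assignment itself that writes directly into a: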

>>> a[np.ix_([0,1,3], [0,2])] = -1
>>> a
array([[-1,  1, -1,  3],
       [-1,  5, -1,  7],
       [ 8,  9, 10, 11],
       [-1, 13, -1, 15],
       [16, 17, 18, 19]])
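To see what is and is not shared with `a`, a quick check along these lines may help. It is only a sketch; the second array `b` and the chained expression are assumptions used for contrast, not something taken from the original answer:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))
idx = np.ix_([0, 1, 3], [0, 2])

sub = a[idx]                        # advanced indexing produces a new array
shares = np.shares_memory(a, sub)   # no memory in common with a
sub[:] = 99                         # modifies the copy only
unchanged = a[0, 0] == 0

a[idx] = -1                         # indexed assignment writes into a itself

b = np.arange(20).reshape((5, 4))
b[[0, 1, 3]][:, [0, 2]] = -1        # assigns into a temporary copy of three rows
```

After this runs, `a` holds `-1` at the selected positions, while `b` is untouched because the chained expression first builds a copy and then assigns into that copy.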

Fancy indexing pairs up the index arrays element by element, so the arrays for all dimensions must have the same shape or be broadcastable to a common shape. You are providing 3 indices for the first dimension and only 2 for the second, hence the error. You want to do something like this:

>>> a[[[0, 0], [1, 1], [3, 3]], [[0,2], [0,2], [0, 2]]]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])

That is of course a pain to write, so you can let broadcasting help you:

>>> a[[[0], [1], [3]], [0, 2]]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
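The three cases discussed above can be checked in a short sketch: mismatched lengths raise, equal-length lists are paired element by element, and a column-shaped row index broadcasts into the full block. The variable names are illustrative only:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))

# Lengths 3 and 2 cannot be broadcast together, so NumPy raises IndexError.
try:
    a[[0, 1, 3], [0, 2]]
    raised = False
except IndexError:
    raised = True

# Equal-length lists are paired up: a[0, 0], a[1, 2], a[3, 1].
paired = a[[0, 1, 3], [0, 2, 1]]

# A (3, 1) row index broadcasts against a (2,) column index to shape (3, 2).
block = a[[[0], [1], [3]], [0, 2]]
```

The middle case is worth noting: matching lengths do not produce a rows-by-columns block, only one element per index pair.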

This is much simpler to do if you index with arrays, not lists:

>>> row_idx = np.array([0, 1, 3])
>>> col_idx = np.array([0, 2])
>>> a[row_idx[:, None], col_idx]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
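If this pattern comes up often, it could be wrapped in a small function. The helper below, `select_block`, is a hypothetical name and not part of NumPy; it simply converts its inputs to integer arrays so that the `[:, None]` trick also works when the caller passes plain lists:

```python
import numpy as np

def select_block(arr, rows, cols):
    """Return the sub-array of a 2-D arr at the given rows and columns.

    Hypothetical helper for illustration; not a NumPy function.
    """
    rows = np.asarray(rows, dtype=np.intp)
    cols = np.asarray(cols, dtype=np.intp)
    # rows[:, None] has shape (len(rows), 1) and broadcasts against cols.
    return arr[rows[:, None], cols]

a = np.arange(20).reshape((5, 4))
block = select_block(a, [0, 1, 3], [0, 2])
print(block)
```

For the inputs shown, the result matches `a[np.ix_([0, 1, 3], [0, 2])]`.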

Use:

>>> a[[0,1,3]][:,[0,2]]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])

Or, since the wanted columns (0 and 2) happen to be evenly spaced, combine the row list with a slice:

>>> a[[0,1,3],::2]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
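As a rough comparison of the two forms above: both give the same result here, but the slice version depends on the columns being evenly spaced. The irregular column list `[0, 1, 3]` below is an assumed example chosen to show a case that no single slice can express:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))

chained = a[[0, 1, 3]][:, [0, 2]]   # select rows first, then columns
sliced = a[[0, 1, 3], ::2]          # row list combined with a step-2 slice

# Columns 0, 1, 3 are not evenly spaced, so a slice cannot select them;
# the chained form (or np.ix_) still works.
irregular = a[[0, 1, 3]][:, [0, 1, 3]]
```

Keep in mind that the chained form builds an intermediate array of the selected rows before picking columns, so it cannot be used on the left-hand side of an assignment to modify `a`.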

Using np.ix_ is the most convenient way to do it (as answered by others), but it can also be done as follows:

>>> rows = [0, 1, 3]
>>> cols = [0, 2]
>>> (a[rows].T)[cols].T
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
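The transpose approach may be easier to follow when split into steps. This sketch (the `step1` to `step3` names are illustrative) traces the shape at each stage for the 5 x 4 array used throughout:

```python
import numpy as np

a = np.arange(20).reshape((5, 4))
rows = [0, 1, 3]
cols = [0, 2]

step1 = a[rows]       # pick the rows             -> shape (3, 4)
step2 = step1.T       # columns become rows       -> shape (4, 3)
step3 = step2[cols]   # pick the wanted columns   -> shape (2, 3)
result = step3.T      # transpose back            -> shape (3, 2)

print(step1.shape, step2.shape, step3.shape, result.shape)
```

This relies on the array being 2-D, where `.T` simply swaps rows and columns.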


Tags: python, arrays, multidimensional-array, numpy, numpy-slicing
