Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find matching rows in 2 dimensional numpy array

I would like to get the index of a 2 dimensional Numpy array that matches a row. For example, my array is this:

vals = np.array([[0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3],
                 [0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3]])

I would like to get the index that matches the row [0, 1] which is index 3 and 15. When I do something like numpy.where(vals == [0 ,1]) I get...

(array([ 0,  3,  3,  4,  5,  6,  9, 12, 15, 15, 16, 17, 18, 21]), array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]))

I want index array([3, 15]).

like image 300
b10hazard Avatar asked Sep 13 '14 13:09

b10hazard


People also ask

How to access any row of a multidimensional array in NumPy?

In NumPy , it is very easy to access any rows of a multidimensional array. All we need to do is Slicing the array according to the given conditions.

What is a 2D array in NumPy?

In this we are specifically going to talk about 2D arrays. 2D Array can be defined as array of an array. 2D array are also called as Matrices which can be represented as collection of rows and columns. In this article, we have explored 2D array in Numpy in Python.

How to find indexes of rows that match exactly between two arrays?

How can I find the indexes of the rows that match exactly between two numpy arrays. For example: x = np.array ( ( [0,1], [1,0], [0,0])) y = np.array ( ( [0,1], [1,1], [0,0])) Show activity on this post. Show activity on this post. This works for every pair of numpy arrays if the length is the same:

How to apply the filter by the condition in NumPy two-dimensional array?

Now let see some example for applying the filter by the given condition in NumPy two-dimensional array. In this example, we are using the np.asarray () method which is explained below: arr : [array_like] Input data, in any form that can be converted to an array.


3 Answers

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

Or, as the documentation states:

If only condition is given, return condition.nonzero()

You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

To dissassemble that:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)

and calling the .all method on that array (with axis=1) gives you True where both are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)

and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)
like image 142
Russia Must Remove Putin Avatar answered Oct 18 '22 05:10

Russia Must Remove Putin


In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])

I'm not sure why, but this is significantly faster than
np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 µs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 µs per loop
like image 16
unutbu Avatar answered Oct 18 '22 03:10

unutbu


Using the numpy_indexed package, you can simply write:

import numpy_indexed as npi
print(np.flatnonzero(npi.contains([[0, 1]], vals)))
like image 3
Eelco Hoogendoorn Avatar answered Oct 18 '22 05:10

Eelco Hoogendoorn