
Using np.where to find matching row in 2D array

I would like to know how to use np.where with a 2D array.

I have the following array:

arr1 = np.array([[ 3.,  0.],
                 [ 3.,  1.],
                 [ 3.,  2.],
                 [ 3.,  3.],
                 [ 3.,  6.],
                 [ 3.,  5.]])

I want to find the row that matches this array:

arr2 = np.array([3.,0.])

But when I use np.where():

np.where(arr1 == arr2)

It returns:

(array([0, 0, 1, 2, 3, 4, 5]), array([0, 1, 0, 0, 0, 0, 0]))

I don't understand what this result means. Can someone explain it to me?

dcalmeida asked Feb 02 '17 01:02



1 Answer

You probably wanted all rows that are equal to your arr2:

>>> np.where(np.all(arr1 == arr2, axis=1))
(array([0], dtype=int64),)

This means that only the first row (index 0) matched.
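If you need this in more than one place, the one-liner can be wrapped in a small function. The following is only a sketch: find_matching_rows is a hypothetical helper name (not a NumPy function), and it assumes an exact, elementwise match is what you want.

```python
import numpy as np


def find_matching_rows(arr, row):
    """Return the indices of the rows of `arr` that equal `row` exactly.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # `row` is broadcast against every row of `arr`, giving a boolean
    # array of the same shape as `arr`.
    elementwise = arr == row
    # A row matches only if every column in it matches.
    row_mask = np.all(elementwise, axis=1)
    # flatnonzero returns the positions of the True entries as a 1-D array.
    return np.flatnonzero(row_mask)


arr1 = np.array([[3., 0.],
                 [3., 1.],
                 [3., 2.],
                 [3., 3.],
                 [3., 6.],
                 [3., 5.]])
arr2 = np.array([3., 0.])

matches = find_matching_rows(arr1, arr2)
print(matches)  # indices of the matching rows
```

An empty result means that no row matched.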


The problem with your approach is that NumPy first broadcasts arr2 against arr1 (visualized here with np.broadcast_arrays):

>>> arr1_tmp, arr2_tmp = np.broadcast_arrays(arr1, arr2)
>>> arr2_tmp
array([[ 3.,  0.],
       [ 3.,  0.],
       [ 3.,  0.],
       [ 3.,  0.],
       [ 3.,  0.],
       [ 3.,  0.]]) 

and then does an elementwise comparison:

>>> arr1 == arr2
array([[ True,  True],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False]], dtype=bool)

and np.where then gives you the coordinates of every True:

>>> np.where(arr1 == arr2)
(array([0, 0, 1, 2, 3, 4, 5], dtype=int64),
 array([0, 1, 0, 0, 0, 0, 0], dtype=int64))
#       ^---- first match (0, 0)
#          ^--- second match (0, 1)
#             ^--- third match (1, 0)
#  ...

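The two arrays are read together, position by position: the first holds the row indices and the second the column indices. So (0, 0) (first row, left item) is the first True, then (0, 1) (first row, right item), then (1, 0) (second row, left item), and so on.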
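To make the pairing explicit, you can zip the two index arrays together, or use np.argwhere, which returns the same coordinates with one (row, column) pair per line. A short sketch using the arrays from the question (the variable names rows, cols, pairs and coords are only illustrative):

```python
import numpy as np

arr1 = np.array([[3., 0.],
                 [3., 1.],
                 [3., 2.],
                 [3., 3.],
                 [3., 6.],
                 [3., 5.]])
arr2 = np.array([3., 0.])

# np.where on a 2D boolean array returns one index array per axis.
rows, cols = np.where(arr1 == arr2)

# Pair the i-th row index with the i-th column index.
pairs = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(pairs)

# np.argwhere gives the same coordinates as an (N, 2) array,
# one (row, column) pair per True element.
coords = np.argwhere(arr1 == arr2)
print(coords)
```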


If you use np.all along axis 1 (that is, across the columns of each row), you get a boolean mask that is True only for rows that are completely equal:

>>> np.all(arr1 == arr2, axis=1)
array([ True, False, False, False, False, False], dtype=bool)

This is easier to see if you keep the dimensions:

>>> np.all(arr1 == arr2, axis=1, keepdims=True)
array([[ True],
       [False],
       [False],
       [False],
       [False],
       [False]], dtype=bool)
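The one-dimensional mask (without keepdims) can also be used directly for boolean indexing. Since the arrays here hold floats, it may be worth noting that == only matches exactly equal values; if your data comes from a computation, a tolerance-based comparison with np.isclose might be closer to what you want. The sketch below assumes the default tolerances of np.isclose are acceptable, and the noisy array is made up purely for illustration:

```python
import numpy as np

arr1 = np.array([[3., 0.],
                 [3., 1.],
                 [3., 2.],
                 [3., 3.],
                 [3., 6.],
                 [3., 5.]])
arr2 = np.array([3., 0.])

# Boolean indexing keeps only the rows where the mask is True.
row_mask = np.all(arr1 == arr2, axis=1)
matching_rows = arr1[row_mask]
print(matching_rows)

# Made-up target that differs from row 0 by a tiny rounding-sized error.
noisy = np.array([3., 1e-12])

# Exact comparison finds nothing ...
exact_mask = np.all(arr1 == noisy, axis=1)
# ... while np.isclose (default rtol and atol) still matches row 0.
close_mask = np.all(np.isclose(arr1, noisy), axis=1)
print(exact_mask.any(), np.flatnonzero(close_mask))
```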
MSeifert answered Oct 01 '22 04:10
