I want to get the indices of the rows of a main 2D NumPy array A that also appear as rows in another array B.
A = array([[1, 2],
           [3, 4],
           [5, 6],
           [7, 8],
           [9, 10]])
B = array([[1, 4],
           [1, 2],
           [5, 6],
           [6, 3]])
result = [0, 2]
This should return [0, 2], i.e. the indices of the matching rows in A.
How can this be done efficiently for 2d arrays?
Thank you!
Edit:
I have tried the function:
k[np.in1d(k.view(dtype='i,i').reshape(k.shape[0]), k2.view(dtype='i,i').reshape(k2.shape[0]))]
from "Implementation of numpy in1d for 2D arrays?", but I get a reshape error. My data are floats (with two decimals). I also tried using sets, but that is quite slow.
With minimal changes, you can get your approach to work:
In [15]: A
Out[15]:
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])
In [16]: B
Out[16]:
array([[1, 4],
       [1, 2],
       [5, 6],
       [6, 3]])
In [17]: np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1))
Out[17]: array([ True, False, True, False, False], dtype=bool)
In [18]: np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))
Out[18]: (array([0, 2], dtype=int64),)
In [19]: np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))[0]
Out[19]: array([0, 2], dtype=int64)
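Note that the 'i,i' view only lines up when each row is exactly two 32-bit integers. With another dtype (for example 64-bit floats) the view no longer yields one record per row, which is probably where the reshape error comes from. Here is a minimal sketch of the same idea that builds the record dtype from the arrays themselves. The helper name rows_in is made up for illustration, and np.isin is used as the newer spelling of np.in1d:

```python
import numpy as np

def rows_in(A, B):
    """Return the indices of rows of A that also occur as rows of B."""
    # Use a common dtype so both views share the same record layout.
    common = np.result_type(A, B)
    A = np.ascontiguousarray(A, dtype=common)
    B = np.ascontiguousarray(B, dtype=common)
    # One record per row, with one field per column.
    rec = np.dtype([(f"f{i}", common) for i in range(A.shape[1])])
    a = A.view(rec).reshape(-1)
    b = B.view(rec).reshape(-1)
    return np.nonzero(np.isin(a, b))[0]

A = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
B = np.array([[1, 4], [1, 2], [5, 6], [6, 3]])
print(rows_in(A, B))  # [0 2]
```

Because the fields are compared by value, this should also work for float arrays, as long as matching rows are exactly equal.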
If your arrays are not floats, and are both contiguous, then the following will be faster:
In [21]: dt = np.dtype((np.void, A.dtype.itemsize * A.shape[1]))
In [22]: np.nonzero(np.in1d(A.view(dt).reshape(-1), B.view(dt).reshape(-1)))[0]
Out[22]: array([0, 2], dtype=int64)
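For float data such as the two-decimal values mentioned in the question, a raw byte comparison can miss rows that are numerically equal: 0.0 and -0.0 have different bytes, and values that print the same may differ in their last bits. One possible workaround, sketched here under the assumption that every value is meant to have exactly two decimals, is to convert to integer hundredths first and then apply the void view. The name rows_in_2dp is hypothetical:

```python
import numpy as np

def rows_in_2dp(A, B):
    """Indices of rows of A found in B, comparing values at two decimals."""
    # Assumption: values are exact to two decimals, so hundredths are integers.
    Ai = np.ascontiguousarray(np.rint(np.asarray(A) * 100).astype(np.int64))
    Bi = np.ascontiguousarray(np.rint(np.asarray(B) * 100).astype(np.int64))
    # View each row as one opaque block of bytes.
    dt = np.dtype((np.void, Ai.dtype.itemsize * Ai.shape[1]))
    return np.nonzero(np.isin(Ai.view(dt).reshape(-1), Bi.view(dt).reshape(-1)))[0]

A = np.array([[1.10, 2.25], [3.00, 4.50], [5.75, 6.01]])
B = np.array([[5.75, 6.01], [1.10, 2.20], [1.10, 2.25]])
print(rows_in_2dp(A, B))  # [0 2]
```

The rounding step is a design choice for this particular data. If the values carry more precision, the scale factor would need to change accordingly.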
And a quick timing:
In [24]: %timeit np.nonzero(np.in1d(A.view('i,i').reshape(-1), B.view('i,i').reshape(-1)))[0]
10000 loops, best of 3: 75 µs per loop
In [25]: %timeit np.nonzero(np.in1d(A.view(dt).reshape(-1), B.view(dt).reshape(-1)))[0]
10000 loops, best of 3: 29.8 µs per loop