how do I get a row-wise comparison between two arrays, in the result of a row-wise true/false array?
Given datas:
a = np.array([[1,0],[2,0],[3,1],[4,2]])
b = np.array([[1,0],[2,0],[4,2]])
Result step 1:
c = np.array([True, True,False,True])
Result final:
a = a[c]
So how do I get the array c
????
P.S.: In this example the arrays a
and b
are sorted, please give also information if in your solution it is important that the arrays are sorted
Here's a vectorised solution:
res = (a[:, None] == b).all(-1).any(-1)
print(res)
array([ True, True, False, True])
Note that a[:, None] == b
compares each row of a
with b
element-wise. We then use all
+ any
to deduce if there are any rows which are all True
for each sub-array:
print(a[:, None] == b)
[[[ True True]
[False True]
[False False]]
[[False True]
[ True True]
[False False]]
[[False False]
[False False]
[False False]]
[[False False]
[False False]
[ True True]]]
Approach #1
We could use a view
based vectorized solution -
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
a = np.ascontiguousarray(a)
b = np.ascontiguousarray(b)
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
return a.view(void_dt).ravel(), b.view(void_dt).ravel()
A,B = view1D(a,b)
out = np.isin(A,B)
Sample run -
In [8]: a
Out[8]:
array([[1, 0],
[2, 0],
[3, 1],
[4, 2]])
In [9]: b
Out[9]:
array([[1, 0],
[2, 0],
[4, 2]])
In [10]: A,B = view1D(a,b)
In [11]: np.isin(A,B)
Out[11]: array([ True, True, False, True])
Approach #2
Alternatively for the case when all rows in b
are in a
and rows are lexicographically sorted, using the same views
, but with searchsorted
-
out = np.zeros(len(A), dtype=bool)
out[np.searchsorted(A,B)] = 1
If the rows are not necessarily lexicographically sorted -
sidx = A.argsort()
out[sidx[np.searchsorted(A,B,sorter=sidx)]] = 1
you can use numpy with apply_along_axis (kind of iteration on specific axis while axis=0 iterate on every cell, axis=1 iterate on every row, axis=2 on matrix and so on
import numpy as np
a = np.array([[1,0],[2,0],[3,1],[4,2]])
b = np.array([[1,0],[2,0],[4,2]])
c = np.apply_along_axis(lambda x,y: x in y, 1, a, b)
You can do it as a list comp via:
c = np.array([row in b for row in a])
though this approach will be slower than a pure numpy approach (if it exists).
a = np.array([[1,0],[2,0],[3,1],[4,2]])
b = np.array([[1,0],[2,0],[4,2]])
i = 0
j = 0
result = []
We can take advantage of the fact that they are sorted and do this in O(n) time. Using two pointers we just move ahead the pointer that has gotten behind:
while i < len(a) and j < len(b):
if tuple(a[i])== tuple(b[j]):
result.append(True)
i += 1
j += 1 # get rid of this depending on how you want to handle duplicates
elif tuple(a[i]) > tuple(b[j]):
j += 1
else:
result.append(False)
i += 1
Pad with False if it ends early.
if len(result) < len(a):
result.extend([False] * (len(a) - len(result)))
print(result) # [True, True, False, True]
This answer is adapted from Better way to find matches in two sorted lists than using for loops? (Java)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With