I have a boolean array of n x n elements and I want to check whether any row is identical to another. If there are any identical rows, I want to check whether the corresponding columns are also identical.
Here is an example:
A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])
I would like the program to find that the first and third rows are identical, and then check whether the first and third columns are also identical, which in this case they are.
You can use np.array_equal():
for i in range(len(A)):  # generate pairs
    for j in range(i + 1, len(A)):
        if np.array_equal(A[i], A[j]):  # compare rows
            if np.array_equal(A[:, i], A[:, j]):  # compare columns
                print(i, j)
Or, using itertools.combinations():
import itertools
for pair in itertools.combinations(range(len(A)), 2):  # generate index pairs
    i, j = pair
    # compare rows i and j, then columns i and j
    if np.array_equal(A[i], A[j]) and np.array_equal(A[:, i], A[:, j]):
        print(pair)
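The pairwise checks above can also be done without a Python-level loop by using broadcasting. The following is only a sketch: the function name `matching_row_col_pairs` is made up for illustration, and the intermediate comparison arrays take O(n^3) memory, so it is probably only reasonable for fairly small matrices.

```python
import numpy as np

def matching_row_col_pairs(A):
    """Return index pairs (i, j), i < j, where rows i and j match and columns i and j match."""
    A = np.asarray(A)
    # rows_eq[i, j] is True when row i equals row j element by element.
    rows_eq = (A[:, None, :] == A[None, :, :]).all(axis=2)
    # The same comparison on the transpose gives column equality.
    cols_eq = (A.T[:, None, :] == A.T[None, :, :]).all(axis=2)
    # Keep the strict upper triangle so each pair is reported once and i != j.
    i, j = np.nonzero(np.triu(rows_eq & cols_eq, k=1))
    return list(zip(i.tolist(), j.tolist()))

A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

print(matching_row_col_pairs(A))  # [(0, 2)]
```

For large n, the explicit loops with np.array_equal may well be the better choice, since they never build the n x n x n intermediate.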
Start with the usual trick for applying np.unique to the rows of a 2D array (viewing each row as a single np.void item), and have it return the index pairs of rows that occur exactly twice:
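def unique_pairs(arr):
    # View each row as one opaque np.void item so np.unique compares whole rows.
    # ravel() keeps the view 1D, so uidx is 1D as np.bincount requires.
    uview = np.ascontiguousarray(arr).view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[1]))).ravel()
    uvals, uidx = np.unique(uview, return_inverse=True)
    # Labels of the unique rows that occur exactly twice.
    pos = np.where(np.bincount(uidx) == 2)[0]
    pairs = []
    for p in pos:
        pairs.append(np.where(uidx == p)[0])  # indices of the two matching rows
    return np.array(pairs)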
We can then do the following:
row_pairs = unique_pairs(A)
col_pairs = unique_pairs(A.T)
for pair in row_pairs:
    # keep a row pair only if the same index pair also appears among the column pairs
    if np.any(np.all(pair == col_pairs, axis=1)):
        print(pair)
>>> [0 2]
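Of course there are quite a few optimizations left to do, but the main point is using np.unique. Note that unique_pairs only reports rows that occur exactly twice, so groups of three or more identical rows are skipped. The efficiency of this method compared to others depends heavily on how you define "small" arrays.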
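One possible refinement, sketched here under the assumption that your NumPy is recent enough for np.unique to accept the axis argument (1.13 or later): let np.unique group the rows directly, then expand each group of identical rows into index pairs. The helper name `duplicate_pairs` is hypothetical, and this variant also covers groups of three or more identical rows.

```python
from itertools import combinations

import numpy as np

def duplicate_pairs(arr):
    """Return the set of index pairs (i, j), i < j, whose rows in arr are identical."""
    arr = np.asarray(arr)
    # inverse[k] is the label of the unique row that row k maps to.
    _, inverse = np.unique(arr, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard against shape differences between NumPy versions
    groups = {}
    for idx, label in enumerate(inverse.tolist()):
        groups.setdefault(label, []).append(idx)
    pairs = set()
    for members in groups.values():
        # members are in increasing order, so every pair comes out as (i, j) with i < j
        pairs.update(combinations(members, 2))
    return pairs

A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

# Pairs that are duplicates both as rows and as columns.
print(sorted(duplicate_pairs(A) & duplicate_pairs(A.T)))  # [(0, 2)]
```

Intersecting the two sets replaces the inner comparison against col_pairs, at the cost of materializing every pair within a group.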