I have a boolean array of n x n elements and I want to check whether any row is identical to another. If there are identical rows, I also want to check whether the corresponding columns are identical.
Here is an example:
A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])
I would like the program to find that the first and third rows are identical, and then check whether the first and third columns are also identical, which in this case they are.
You can use np.array_equal():
for i in range(len(A)):  # generate pairs
    for j in range(i + 1, len(A)):
        if np.array_equal(A[i], A[j]):  # compare rows
            if np.array_equal(A[:, i], A[:, j]):  # compare columns
                print(i, j)
Or, using itertools.combinations():
import itertools

for pair in itertools.combinations(range(len(A)), 2):
    # compare the two rows first, then the two corresponding columns
    if np.array_equal(A[pair[0]], A[pair[1]]) and np.array_equal(A[:, pair[0]], A[:, pair[1]]):
        print(pair)
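If the Python-level loop over pairs becomes the bottleneck, the same check can be sketched with broadcasting. This is only an illustration, not part of the answer above: the names rows_equal, cols_equal, and both are made up here, and the approach builds n x n x n temporary arrays, so it is probably only sensible for fairly small n.

```python
import numpy as np

A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

# rows_equal[i, j] is True when row i and row j match element-wise
rows_equal = (A[:, None, :] == A[None, :, :]).all(axis=2)
# cols_equal[i, j] is True when column i and column j match element-wise
cols_equal = (A.T[:, None, :] == A.T[None, :, :]).all(axis=2)

# Keep only i < j so each pair is reported once and i == j is skipped
both = np.triu(rows_equal & cols_equal, k=1)
pairs = np.argwhere(both)
print(pairs.tolist())  # [[0, 2]]
```

Each row of pairs holds the indices (i, j) for which both the rows and the columns are identical.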
Starting with the typical way to apply np.unique to 2D arrays, we can have it return the index pairs of rows that occur exactly twice:
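def unique_pairs(arr):
    # View each row as one opaque (void) item so np.unique compares whole rows
    arr = np.ascontiguousarray(arr)
    uview = arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[1])))
    # Flatten the (n, 1) view so the inverse indices come back 1-D for np.bincount
    uvals, uidx = np.unique(uview.ravel(), return_inverse=True)
    # Keep only the rows that occur exactly twice
    pos = np.where(np.bincount(uidx) == 2)[0]
    pairs = []
    for p in pos:
        pairs.append(np.where(uidx == p)[0])
    return np.array(pairs)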
We can then do the following:
row_pairs = unique_pairs(A)
col_pairs = unique_pairs(A.T)
for pair in row_pairs:
    if np.any(np.all(pair == col_pairs, axis=1)):
        print(pair)
>>> [0 2]
Of course, there are quite a few optimizations left to do, but the main point is to use np.unique. The efficiency of this method compared to the others depends heavily on how you define "small" arrays.
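As a possible variation, NumPy 1.13 and later accept an axis argument in np.unique, which avoids the void-view trick. The sketch below is an assumption-laden illustration rather than the method above: duplicate_pairs is a hypothetical helper name, and unlike the version above it also reports pairs from groups of three or more identical rows.

```python
import numpy as np
from itertools import combinations

def duplicate_pairs(arr):
    """Return the set of index pairs (i, j), i < j, whose rows in arr are identical."""
    arr = np.asarray(arr)
    # inverse[k] is the label of the unique row that row k maps to
    _, inverse = np.unique(arr, axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # guard against a non-1-D inverse in some NumPy versions
    pairs = set()
    for label in np.unique(inverse):
        members = np.flatnonzero(inverse == label)
        # A label with a single member yields no pairs
        pairs.update(combinations(members.tolist(), 2))
    return pairs

A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

# Pairs that are duplicated both as rows and as columns
matches = sorted(duplicate_pairs(A) & duplicate_pairs(A.T))
print(matches)  # [(0, 2)]
```

Intersecting the two sets replaces the pair-by-pair comparison between row_pairs and col_pairs, and it also behaves sensibly when one of the sets is empty.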