Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find out if two symmetric matrices are the same up to a permutation of the rows/columns

I have two symmetric (item co-occurrence) matrices A and B and want to find out if they describe the same co-occurrence, only with the row/column labels permuted. (The same permutation has to be applied to rows and columns to keep the symmetry/co-occurrence property)

For example these two matrices should be equal in my test:

a = np.array([
    #1 #2 #3 #4 #5 #6 #7
    [0, 1, 1, 0, 0, 0, 1], #1
    [1, 0, 1, 2, 1, 1, 2], #2
    [1, 1, 0, 0, 0, 0, 1], #3
    [0, 2, 0, 0, 4, 0, 4], #4
    [0, 1, 0, 4, 0, 1, 2], #5
    [0, 1, 0, 0, 1, 0, 0], #6
    [1, 2, 1, 4, 2, 0, 0]  #7
])
b = np.array([
    #5 #7 #1,3#3,1#2 #4 #6
    [0, 2, 0, 0, 1, 4, 1], #5
    [2, 0, 1, 1, 2, 4, 0], #7
    [0, 1, 0, 1, 1, 0, 0], #1,3 could be either
    [0, 1, 1, 0, 1, 0, 0], #1,3 could be either
    [1, 2, 1, 1, 0, 2, 1], #2
    [4, 4, 0, 0, 2, 0, 0], #4
    [1, 0, 0, 0, 1, 0, 0]  #6
])

I currently test if the eigenvalues are the same using numpy.linalg.eigvals (I am not even sure this is a sufficient condition), but I would like to find a test which doesn't involve numerical accuracy, since I am dealing with integers here.

like image 503
C. Yduqoli Avatar asked Oct 29 '18 09:10

C. Yduqoli


2 Answers

Here's a vectorized solution based on sorting and leveraging searchsorted -

import pandas as pd

# Sort rows for a and b
aS = np.sort(a,axis=1)
bS = np.sort(b,axis=1)

# Scale down each row to a scalar each
scale = np.r_[(np.maximum(aS.max(0),bS.max(0))+1)[::-1].cumprod()[::-1][1:],1]
aS1D = aS.dot(scale)
bS1D = bS.dot(scale)

# Use searchsorted to get the correspondence on indexing
sidx = aS1D.argsort()
searchsorted_idx = np.searchsorted(aS1D,bS1D,sorter=sidx)
searchsorted_idx[searchsorted_idx==len(aS1D)] = len(aS1D)-1
df = pd.DataFrame({'A':searchsorted_idx})
new_order = sidx[df.groupby('A').cumcount().values+searchsorted_idx]
# new_order is the permuted order, i.e. [5, 7, 1, 3, 2, 4, 6]

# Finally index into a with the new_order and compare against b
out = np.array_equal(a[new_order[:,None], new_order],b)
like image 159
Divakar Avatar answered Oct 26 '22 23:10

Divakar


I will assume that you have the list of permutation of the rows/columns of a which gives the b, e.g. something like this

p = np.array([5, 7, 1, 3, 2, 4, 6]) - 1

Then you can simply do the following on a

a_p = a[p]
a_p = a_p[:, p]

and check if b and the permuted a_p are equal:

(a_p == b).all()

Edit: since you don't have a list like the above p, you could (at least for small arrays a and b) generate the permutations of the indices and check for each one:

from itertools import permutations

def a_p(a, b, p):
    p = np.array(p)
    a_p = a[p]
    a_p = a_p[:, p]
    return a_p

for p in permutations(range(a.shape[0])):
    if (a_p(a, b, p) == b).all():
        print('True')
        break
else:
    print('False')

Note that this brute-force method works for non-symmetric matrices as well. But since the number of permutations is huge for large arrays a and b, this method can be very slow. So your solution with computing eigenvalues is much better.

Here's a benchmark:

def Yduqoli(a, b):
    ''' I suppose your solution is similar'''
    if (np.array(np.unique(a, return_counts=True)) == np.array(np.unique(b, return_counts=True))).all():
        a_eigs = np.sort(np.linalg.eigvals(a))
        b_eigs = np.sort(np.linalg.eigvals(b))
        return np.allclose(a_eigs, b_eigs)
    else:
        return False

def AndyK(a, b):
    for p in permutations(range(a.shape[0])):
        if (a_p(a, b, p) == b).all():
            return True
    return False  

%timeit AndyK(a,b)
103 ms ± 4.54 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit Yduqoli(a,b)
408 µs ± 65.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

where I used the symmetric matrices a and b provided by the OP.

Update: As mentioned by Paul Panzer, simply checking the eigenvalues can give incorrect result in some cases, e.g. a = np.array([[4, 0], [0, 0]]), b = np.array([[2, 2], [2, 2]]) have the same eigenvalues, but cannot be shuffled one into the other. So we first need to check if the arrays a and b have the same elements (regardless their positions).

like image 33
Andreas K. Avatar answered Oct 26 '22 23:10

Andreas K.