Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast algorithm to compute Adamic-Adar

I'm working on graph analysis. I want to compute an N by N similarity matrix that contains the Adamic Adar similarity between every two vertices. To give an overview of Adamic Adar let me start with this introduction:

Given the adjacency matrix A of an undirected graph G. CN is the set of all common neighbors of two vertices x, y. A common neighbor of two vertices is one where both vertices have an edge/link to, i.e. both vertices will have a 1 for the corresponding common neighbor node in A. k_n is the degree of node n.

Adamic-Adar is defined as the following: enter image description here

My attempt to compute it is to fetch both rows of the x and y nodes from A and then sum them. Then look for the elements that has 2 as the value and then gets their degrees and apply the equation. However computing that takes really really a long of time. I tried with a graph that contains 1032 vertices and it took a lot of time to compute. It started with 7 minutes and then I cancelled the computations. So my question: is there a better algorithm to compute it?

Here's my code in python:

def aa(graph):

"""
    Calculates the Adamic-Adar index.

"""
N = graph.num_vertices()
A = gts.adjacency(graph)
S = np.zeros((N,N))
degrees = get_degrees_dic(graph)
for i in xrange(N):
    A_i = A[i]
    for j in xrange(N):
        if j != i:
            A_j = A[j]
            intersection = A_i + A_j
            common_ns_degs = list()
            for index in xrange(N):
                if intersection[index] == 2:
                    cn_deg = degrees[index]
                    common_ns_degs.append(1.0/np.log10(cn_deg))
            S[i,j] = np.sum(common_ns_degs)
return S 
like image 516
Jack Twain Avatar asked Mar 21 '14 17:03

Jack Twain


2 Answers

I believe you are using rather slow approach. It would better to revert it -
- initialize AA (Adamic-Adar) matrix by zeros
- for every node k get it's degree k_deg
- calc d = log(1.0/k_deg) (why log10 - is it important or not?)
- add d to all AAij, where i,j - all pairs of 1s in kth row of adjacency matrix
Edit:
- for sparse graphs it is useful to extract positions of all 1s in kth row to the list to reach O(V*(V+E)) complexity instead of O(V^3)

AA = np.zeros((N,N))
for k = 0 to N - 1 do
    AdjList = []
    for j = 0 to N - 1 do
        if A[k, j] = 1 then
            AdjList.Add(j)
    k_deg = AdjList.Length
    d = log(1/k_deg)
    for j = 0 to AdjList.Length - 2 do
      for i = j+1 to AdjList.Length - 1 do
         AA[AdjList[i],AdjList[j]] = AA[AdjList[i],AdjList[j]] + d  
         //half of matrix filled, it is symmetric for undirected graph
like image 197
MBo Avatar answered Sep 22 '22 09:09

MBo


Since you're using numpy, you can really cut down on your need to iterate for every operation in the algorithm. my numpy- and vectorized-fu aren't the greatest, but the below runs in around 2.5s on a graph with ~13,000 nodes:

def adar_adamic(adj_mat):    
    """Computes Adar-Adamic similarity matrix for an adjacency matrix"""

    Adar_Adamic = np.zeros(adj_mat.shape)
    for i in adj_mat:
        AdjList = i.nonzero()[0] #column indices with nonzero values
        k_deg = len(AdjList)
        d = np.log(1.0/k_deg) # row i's AA score

        #add i's score to the neighbor's entry
        for i in xrange(len(AdjList)):
            for j in xrange(len(AdjList)):
                if AdjList[i] != AdjList[j]:
                    cell = (AdjList[i],AdjList[j])
                    Adar_Adamic[cell] = Adar_Adamic[cell] + d

    return Adar_Adamic

unlike MBo's answer, this does build the full, symmetric matrix, but the inefficiency (for me) was tolerable, given the execution time.

like image 29
Mike Avatar answered Sep 19 '22 09:09

Mike