 

Which is the best way to multiply a large and sparse matrix with its transpose?

I currently want to multiply a large sparse matrix (~1M × 200k) by its transpose. The values of the resulting matrix would be floats.

  • I tried loading the matrix into scipy's sparse matrix format and multiplying each row of the first matrix by the second matrix. The multiplication took ~2 hours to complete.

What is an efficient way to achieve this multiplication? I ask because I see a pattern in the computation:

  • The matrix is large and sparse.
  • A matrix is being multiplied by its own transpose, so the resulting matrix is symmetric.

I would like to know which libraries can perform this computation faster. It can be in Python, R, C, C++ or any other language.
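For reference, scipy can compute this product directly with its compiled sparse–sparse multiplication, without any explicit Python loop. A minimal sketch, using a small random matrix as a stand-in for the ~1M × 200k one (the sizes and density below are illustrative assumptions, not from the question):

```python
import numpy as np
from scipy import sparse

# Small stand-in for the ~1M x 200k matrix; density is an assumption.
A = sparse.random(1000, 200, density=0.01, format="csr", random_state=0)

# A @ A.T dispatches to scipy's compiled sparse-sparse multiplication
# and returns a sparse (CSR) Gram matrix, which is symmetric.
G = (A @ A.T).astype(np.float32)  # float values, as in the question

print(G.shape)  # (1000, 1000)
```

Whether this is fast enough at full scale depends heavily on the density of the result; if A's rows share many non-zero columns, A·Aᵀ can be far denser than A itself.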

asked Jul 04 '14 by sravan_kumar



2 Answers

from scipy.sparse import lil_matrix

def URMdist(URM):
    NLin = URM.shape[0]
    URMt = URM.T.tocsc()              # column slices are cheap in CSC
    Result = lil_matrix((NLin, NLin))
    for Lin in range(NLin):
        for Col in range(Lin, NLin):  # only compute the upper triangle
            # the row-by-column dot product yields a 1x1 sparse matrix;
            # extract the scalar before testing it
            X = URM[Col, :].dot(URMt[:, Lin])[0, 0]
            if X != 0.0:
                Result[Lin, Col] = Result[Col, Lin] = X
    return Result
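A quick self-contained check of the half-triangle loop above against scipy's built-in product, on a small random matrix (the sizes and density are arbitrary, chosen only for the demonstration):

```python
from scipy.sparse import lil_matrix, random as sparse_random

def URMdist(URM):
    # Same idea as the answer above: fill only the upper triangle,
    # then mirror each value into the lower triangle.
    NLin = URM.shape[0]
    URMt = URM.T.tocsc()
    Result = lil_matrix((NLin, NLin))
    for Lin in range(NLin):
        for Col in range(Lin, NLin):
            X = URM[Lin, :].dot(URMt[:, Col])[0, 0]
            if X != 0.0:
                Result[Lin, Col] = Result[Col, Lin] = X
    return Result

A = sparse_random(30, 10, density=0.2, format="csr", random_state=1)
diff = abs(URMdist(A) - A @ A.T).max()
print(diff < 1e-10)  # True: the loop matches the library product
```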
answered Oct 01 '22 by Rodrigo Coelho


I suppose your main need is to save memory. First, since you multiply a matrix by its transpose, you do not need any memory for the transpose: all of its cells are directly accessible through the first matrix (tA[i,j] = A[j,i]). Almost 1/3 of the memory saved.

Computation time cannot be neglected either. Since the resulting matrix is symmetric, you can compute only one half and directly store the other. Nearly half of the computation time saved.

And if you are sure that your initial matrix is really sparse, and so can hope the resulting one will be too, you can store the result directly in a scipy sparse matrix in COO format: just three lists to store the non-zero values.

But ... I do not know of any library that does exactly this, so you will have to code it yourself in your preferred language (probably Python, since you mentioned scipy).

Python code example (matrix = A[M][N])

I = []
J = []
V = []
for i in range(M):
    for j in range(i, M):
        X = 0.0
        for k in range(N):
            X += A[i][k] * A[j][k]  # (A * tA)[i][j] = row i of A dot row j of A
        if X != 0.0:  # or abs(X) > epsilon if floating point accuracy is a concern ...
            I.append(i)
            J.append(j)
            V.append(X)
            if i != j:  # mirror off-diagonal entries; do not store the diagonal twice
                I.append(j)
                J.append(i)
                V.append(X)

And I, J, V are what is needed for a scipy COO sparse matrix via :

RESULT = sparse.coo_matrix((V,(I,J)),shape=(M, M))
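Putting the triple loop and the COO construction together, here is a runnable sketch on a toy 4×3 matrix (the values are arbitrary, chosen only for illustration), verified against the dense product:

```python
import numpy as np
from scipy import sparse

# Toy dense matrix A (M x N), as nested lists.
A = [[1.0, 0.0, 2.0],
     [0.0, 0.0, 0.0],
     [3.0, 0.0, 0.0],
     [0.0, 4.0, 0.0]]
M, N = len(A), len(A[0])

I, J, V = [], [], []
for i in range(M):
    for j in range(i, M):
        X = sum(A[i][k] * A[j][k] for k in range(N))  # (A * tA)[i][j]
        if X != 0.0:
            I.append(i); J.append(j); V.append(X)
            if i != j:  # mirror the off-diagonal entry
                I.append(j); J.append(i); V.append(X)

RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))
print(np.allclose(RESULT.toarray(), np.array(A) @ np.array(A).T))  # True
```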
answered Oct 01 '22 by Serge Ballesta