
SciPy sparse matrices: element-wise multiplication

I am trying to do an element-wise multiplication for two large sparse matrices. Both are of size around (400K X 500K), with around 100M elements.

However, they might not have non-zero elements in the same positions, and they might not have the same number of non-zero elements. In either case, I'm fine with the product being zero wherever one matrix has a non-zero value and the other has a zero.

I keep running out of memory (8GB) with every approach, which doesn't make much sense to me. Here is what I've tried.

A and B are sparse matrices (I've tried both COO and CSC formats).

# I have loaded sparse matrices A and B, and have a file opened in write mode
row,col = A.nonzero()
index = zip(row,col)
del row,col
for i,j in index :
    # Approach 1
    A[i,j] *= B[i,j]

    # Approach 2
    someopenfile.write(' '.join([str(i),str(j),str(A[i,j]*B[i,j]),'\n']))

    # Approach 3
    if B[i,j] != 0 :
        A[i,j] = A[i,j]*B[i,j] # or, I wrote it to a file instead 
                               # like in approach 2

If I comment out the for loop, I see that I use almost 3.5GB of memory. But the moment I use the loop, whether I'm writing the products to a file or back to a matrix, the memory usage shoots up until it fills all available memory, and either I have to stop the execution or the system hangs. How can I do this operation without consuming so much memory?

Avisek asked Jan 28 '15 08:01


1 Answer

I suspect that your sparse matrices are becoming non-sparse when you perform the operation. Have you tried just:

A.multiply(B)

I suspect that it will be better optimised than anything you can easily write yourself.

If A is not already the correct type of sparse matrix you might need:

A = A.tocsr()
# May also need 
# B = B.tocsr()
A = A.multiply(B)
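
As a rough sketch of how this behaves when the two matrices have different sparsity patterns, here is a small example. The 2x3 matrices and their values are made up for illustration, and the `lines` list stands in for whatever you would write to your output file:

```python
import numpy as np
from scipy import sparse

# Tiny stand-ins for the large matrices; the values are invented for illustration.
A = sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                [0.0, 3.0, 0.0]]))
B = sparse.csr_matrix(np.array([[4.0, 5.0, 0.0],
                                [0.0, 6.0, 7.0]]))

# Element-wise product. A position that is zero in either matrix
# is zero in the result, and the result is still a sparse matrix.
C = A.multiply(B)

# Converting to COO exposes the row, column and value arrays of the
# stored entries, so the products can be listed without indexing
# A[i, j] and B[i, j] one element at a time.
C = C.tocoo()
lines = ["%d %d %s" % (i, j, v) for i, j, v in zip(C.row, C.col, C.data)]
```

Only the entries present in both A and B survive, which matches the "anything times zero is zero" behaviour asked for in the question.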
Steve Barnes answered Nov 04 '22 15:11