Multiplying large sparse matrices in python

I would like to multiply two large sparse matrices. The first is 150,000x300,000 and the second is 300,000x300,000. The first matrix has about 1,000,000 non-zero items and the second matrix has about 20,000,000 non-zero items. Is there a straightforward way to get the product of these matrices?

I'm currently storing the matrices in CSR or CSC format and computing matrix_a * matrix_b, which raises ValueError: array is too big.

I'm guessing I could store the separate matrices on disk with pytables, pull them apart into smaller blocks, and construct the final matrix product from the products of many blocks. But I'm hoping for something relatively simple to implement.
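The block idea described above can be sketched roughly as follows. This is only a minimal in-memory sketch, not a disk-backed solution: the names `blockwise_matmul` and `block_rows` are hypothetical, and it assumes both inputs and the final product fit in RAM. It splits the first matrix into horizontal slabs, multiplies each slab by the second matrix, and stacks the partial results.

```python
import scipy.sparse as sp


def blockwise_matmul(a, b, block_rows=10_000):
    """Multiply sparse a @ b one horizontal slab of `a` at a time.

    `blockwise_matmul` and `block_rows` are illustrative names,
    not part of SciPy.
    """
    a = sp.csr_matrix(a)  # CSR makes row slicing cheap
    b = sp.csr_matrix(b)
    slabs = []
    for start in range(0, a.shape[0], block_rows):
        stop = min(start + block_rows, a.shape[0])
        # Each slab product is a (stop - start) x b.shape[1] sparse matrix.
        slabs.append(a[start:stop] @ b)
    # Stack the slab products back into the full result.
    return sp.vstack(slabs, format="csr")


# Small demo: the blocked product should match the direct product.
a_small = sp.random(300, 500, density=0.01, format="csr", random_state=0)
b_small = sp.random(500, 400, density=0.01, format="csr", random_state=1)
c_blocked = blockwise_matmul(a_small, b_small, block_rows=64)
c_direct = a_small @ b_small
max_abs_diff = abs(c_blocked - c_direct).max()
```

If the full product does not fit in memory, each slab could be written to disk instead of being appended to the list, for example with `scipy.sparse.save_npz`, but that reintroduces the bookkeeping this question is trying to avoid.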

EDIT: I'm hoping for a solution that works for arbitrarily large sparse matrices, while hiding (or avoiding) the bookkeeping involved in moving individual blocks back and forth between memory and disk.

DanB asked Jun 14 '12 05:06



1 Answer

Strange, because the following worked for me:

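import scipy.sparse

# Dimensions are given as integers; float arguments such as 150e3 may be rejected by recent SciPy versions.
mat1 = scipy.sparse.rand(150000, 300000, density=1e6/150e3/300e3)
# Note: this density yields about 40e6 non-zeros, twice the 20e6 stated in the question.
mat2 = scipy.sparse.rand(300000, 300000, density=20e6/150e3/300e3)
cmat1 = scipy.sparse.csc_matrix(mat1)
cmat2 = scipy.sparse.csc_matrix(mat2)
res = cmat1 * cmat2  # for csc_matrix, * is the matrix product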

I'm using the latest SciPy, and Python used about 3 GB of RAM.

So maybe the product of your matrices is not very sparse?
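One way to check this before attempting the multiplication is to compute an upper bound on the number of stored entries in the product. The sketch below is an assumption-laden illustration, not something from the original answer: `product_nnz_upper_bound` is a hypothetical helper. It relies on the fact that each stored entry (i, k) of the first matrix can contribute at most as many entries to row i of the product as there are stored entries in row k of the second matrix.

```python
import numpy as np
import scipy.sparse as sp


def product_nnz_upper_bound(a, b):
    """Upper bound on stored entries in a @ b (hypothetical helper)."""
    a = sp.csr_matrix(a)
    b = sp.csr_matrix(b)
    # Number of stored entries in each row of b.
    b_row_nnz = np.diff(b.indptr).astype(np.int64)
    # a.indices holds the column index k of every stored entry of a;
    # summing the matching row counts of b bounds the product's size.
    return int(b_row_nnz[a.indices].sum())


# Small demo comparing the bound with the actual product.
a_small = sp.random(200, 300, density=0.02, format="csr", random_state=0)
b_small = sp.random(300, 250, density=0.02, format="csr", random_state=1)
bound = product_nnz_upper_bound(a_small, b_small)
actual = (a_small @ b_small).nnz
```

For uniformly random matrices with the sizes in the question, the second matrix averages about 67 stored entries per row, so the bound comes to roughly 1e6 x 67, or about 6.7e7 entries. Real matrices with a few very dense rows or columns can give a far larger bound, which would point to a product that is too dense to store.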

sega_sai answered Oct 21 '22 23:10