
ValueError taking dot product of two sparse matrices in SciPy

I'm trying to take the dot product of two lil_matrix sparse matrices that are approx. 100,000 x 50,000 and 50,000 x 100,000 respectively.

from scipy import sparse
a = sparse.lil_matrix((100000, 50000))
b = sparse.lil_matrix((50000, 100000))

c = a.dot(b)

and I get this error:

  File "/usr/lib64/python2.6/site-packages/scipy/sparse/base.py", line 211, in dot
    return self * other
  File "/usr/lib64/python2.6/site-packages/scipy/sparse/base.py", line 247, in __mul__
    return self._mul_sparse_matrix(other)
  File "/usr/lib64/python2.6/site-packages/scipy/sparse/base.py", line 300, in _mul_sparse_matrix
    return self.tocsr()._mul_sparse_matrix(other)
  File "/usr/lib64/python2.6/site-packages/scipy/sparse/compressed.py", line 290, in _mul_sparse_matrix
    indices = np.empty(nnz, dtype=np.intc)
ValueError: negative dimensions are not allowed

Any ideas on what might be happening? I'm running this on a machine with about 64 GB of RAM, and the process uses about 13 GB while executing the dot product.

Sripad Sriram asked Sep 07 '25 03:09



1 Answer

This is a bad error message, but the "problem" is quite simply that your resulting matrix would be too big: it would have too many nonzero elements. The limit is on the number of nonzeros, not on the dimensions.

SciPy uses int32 to store indptr and indices for the sparse formats. This means that your sparse matrix cannot have more than (approximately) 2^31 nonzero elements. If this is not just a toy problem, maybe you could change the code in SciPy to use int64 or uint32. But maybe sparse matrices are not the best tool for this problem in the first place?
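
To make the limit above concrete, here is a minimal sketch in plain Python. It assumes the nonzero count is held in a signed 32-bit integer, as described above. The helper names as_int32 and product_nnz_upper_bound are hypothetical, and the density of 250 nonzeros per column of a and per row of b is made up for illustration; it is not taken from the question's data.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_int32(count):
    """Return what `count` becomes when stored in a signed 32-bit integer."""
    # ctypes integer types do no overflow checking, so large values wrap around.
    return ctypes.c_int32(count).value


def product_nnz_upper_bound(a_col_nnz, b_row_nnz):
    """Upper bound on the nonzeros of A.dot(B).

    a_col_nnz[k] is the number of nonzeros in column k of A,
    b_row_nnz[k] is the number of nonzeros in row k of B.
    The true count can be smaller when several products land on the same entry.
    """
    return sum(ca * rb for ca, rb in zip(a_col_nnz, b_row_nnz))


# Hypothetical density: 250 nonzeros in every column of A and every row of B,
# with the inner dimension of 50,000 from the question.
inner = 50000
bound = product_nnz_upper_bound([250] * inner, [250] * inner)

print("upper bound on nnz:", bound)
print("exceeds int32 range:", bound > INT32_MAX)
print("value after int32 wraparound:", as_int32(bound))
```

With these assumed densities the bound lies between 2^31 and 2^32, so the wrapped value is negative, which is consistent with np.empty being handed a negative size in the traceback. A bound like this only shows when overflow is possible; it does not say what the actual count for a given pair of matrices is.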

EDIT: This is solved in newer SciPy versions, AFAIK.
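
One way to see which index width a given SciPy installation uses is to inspect the index arrays of a CSR result. This is a small sketch with scaled-down stand-in matrices (the shapes and values are made up for illustration). It only reports the dtype that was chosen and does not by itself confirm that a particular version handles more than 2^31 nonzeros.

```python
import numpy as np
from scipy import sparse

# Scaled-down stand-ins for the matrices in the question (100 x 50 and 50 x 100).
rows = np.arange(100)
a = sparse.csr_matrix((np.ones(100), (rows, rows % 50)), shape=(100, 50))

diag = np.arange(50)
b = sparse.csr_matrix((np.ones(50), (diag, diag)), shape=(50, 100))

c = a.dot(b)

# indices and indptr are the integer arrays whose width limits the nonzero count.
print("shape:", c.shape, "nnz:", c.nnz)
print("indices dtype:", c.indices.dtype)
print("indptr dtype:", c.indptr.dtype)
```

If the reported dtype is int32 for small inputs, that alone is not a problem: the question is whether the installed version switches to a wider type when the result needs it, which is worth testing on the real data.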

seberg answered Sep 10 '25 02:09
