scipy.sparse dot extremely slow in Python

The following code will not even finish on my system:

import numpy as np
from scipy import sparse
p = 100
n = 50
X = np.random.randn(p,n)
L = sparse.eye(p,p, format='csc')
X.T.dot(L).dot(X)

Is there any explanation why this matrix multiplication is hanging?

asked Jan 07 '13 21:01

bluecat

1 Answer

X.T.dot(L) is not, as you might expect, a 50x100 matrix of numbers. It is a 50x100 object array in which every element is itself a 100x100 sparse matrix:

>>> X.T.dot(L).shape
(50, 100)
>>> X.T.dot(L)[0,0]
<100x100 sparse matrix of type '<type 'numpy.float64'>'
    with 100 stored elements in Compressed Sparse Column format>
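
If you want to detect this situation in your own code, here is a minimal sketch. The helper name is_object_array_of_sparse is made up for this illustration and is not part of NumPy or SciPy. The sizes are reduced so it runs quickly, and the dtype you see may depend on your NumPy/SciPy versions, so the script prints it rather than assuming it.

```python
import numpy as np
from scipy import sparse

def is_object_array_of_sparse(a):
    """True if `a` is an object-dtype ndarray whose first element is sparse.

    Hypothetical helper for illustration only.
    """
    return bool(
        isinstance(a, np.ndarray)
        and a.dtype == object
        and a.size > 0
        and sparse.issparse(a.flat[0])
    )

# Small sizes so the check runs quickly.
p, n = 6, 3
X = np.random.randn(p, n)
L = sparse.eye(p, p, format='csc')

# The array's own dot method, applied to a sparse operand.
suspect = X.T.dot(L)
print(suspect.shape, suspect.dtype, is_object_array_of_sparse(suspect))
```

If the helper reports True, any further dot on that result works element by element on sparse matrices, which would explain why the original expression appears to hang.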

The problem seems to be that X is a NumPy array, so its dot method doesn't know about sparse matrices. One option is to convert the sparse matrix to dense with its todense or toarray method. The former returns a matrix object, the latter an array:

>>> X.T.dot(L.todense()).dot(X)
matrix([[  81.85399873,    3.75640482,    1.62443625, ...,    6.47522251,
            3.42719396,    2.78630873],
        [   3.75640482,  109.45428475,   -2.62737229, ...,   -0.31310651,
            2.87871548,    8.27537382],
        [   1.62443625,   -2.62737229,  101.58919604, ...,    3.95235372,
            1.080478  ,   -0.16478654],
        ..., 
        [   6.47522251,   -0.31310651,    3.95235372, ...,   95.72988689,
          -18.99209596,   17.31774553],
        [   3.42719396,    2.87871548,    1.080478  , ...,  -18.99209596,
          108.90045569,  -16.20312682],
        [   2.78630873,    8.27537382,   -0.16478654, ...,   17.31774553,
          -16.20312682,  105.37102461]])
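
For completeness, a small sketch of the toarray variant, which gives a plain array instead of a matrix object. The seeded generator is my addition so the run is repeatable. Since L is the identity in this example, the result can be checked against X.T.dot(X).

```python
import numpy as np
from scipy import sparse

p, n = 100, 50
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
X = rng.standard_normal((p, n))
L = sparse.eye(p, p, format='csc')

# toarray() returns a plain ndarray, so ordinary array dot products apply.
dense_L = L.toarray()
result = X.T.dot(dense_L).dot(X)

print(type(dense_L).__name__, result.shape)
# L is the identity here, so the product should match X.T.dot(X).
print(np.allclose(result, X.T.dot(X)))
```

Keep in mind that densifying only makes sense while the p x p matrix fits comfortably in memory.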

Alternatively, sparse matrices have a dot method that knows about arrays:

>>> X.T.dot(L.dot(X))
array([[  81.85399873,    3.75640482,    1.62443625, ...,    6.47522251,
           3.42719396,    2.78630873],
       [   3.75640482,  109.45428475,   -2.62737229, ...,   -0.31310651,
           2.87871548,    8.27537382],
       [   1.62443625,   -2.62737229,  101.58919604, ...,    3.95235372,
           1.080478  ,   -0.16478654],
       ..., 
       [   6.47522251,   -0.31310651,    3.95235372, ...,   95.72988689,
         -18.99209596,   17.31774553],
       [   3.42719396,    2.87871548,    1.080478  , ...,  -18.99209596,
         108.90045569,  -16.20312682],
       [   2.78630873,    8.27537382,   -0.16478654, ...,   17.31774553,
         -16.20312682,  105.37102461]])
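
A sketch of the same idea when the sparse matrix has to sit on the right, assuming the setup from the question. It uses the identity X.T L = (L.T X).T so that the sparse matrix's dot method always does the multiplication. The variable names LX and XtL are only for illustration.

```python
import numpy as np
from scipy import sparse

p, n = 100, 50
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
X = rng.standard_normal((p, n))
L = sparse.eye(p, p, format='csc')

# Sparse on the left: SciPy handles L.dot(X) and returns a dense array.
LX = L.dot(X)
result = X.T.dot(LX)

# If X.T L itself is needed, compute it as (L.T X).T so the sparse
# matrix is again the one whose dot method gets called.
XtL = L.T.dot(X).T
same = np.allclose(XtL.dot(X), result)
print(result.shape, same)
```

This keeps L sparse throughout, so it should scale better than converting to dense when p is large.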

answered Oct 03 '22 23:10

Jaime

Tags: python, numpy, scipy, sparse-matrix
