I'm trying to port some code from Matlab to Python and I've run into a problem. I can't seem to find an equivalent to svds.
I tried using numpy.corrcoef and then numpy.linalg.eig, but numpy.corrcoef doesn't work for large arrays (say 500 x 20000).
Here's the code in MATLAB, in case it makes any difference:
s = size(data, 2)
mean = sum(data, 2)/s
m_data = ( data - repmat(mean, 1, s) ) / sqrt(s - 1)
[res_u,res_s] = svds(m_data, s)
eigenvals = diag(res_s).^2
eigenvecs = res_u
MATLAB is faster than Python, but Python is better at running multiple jobs in parallel.
The code is almost the same, yet the performance is very different: MATLAB completes the task in 0.252454 seconds, while NumPy takes 0.973672 seconds, almost four times as long.
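For reference, the NumPy side of that measurement looks roughly like this (the timing harness and the random test matrix below are only a sketch of the setup, not the exact benchmark code):

import time
import numpy as np

# Rough timing of the NumPy equivalent on the same problem size.
data = np.random.rand(500, 20000)
start = time.perf_counter()
s = data.shape[1]
m_data = (data - data.mean(axis=1, keepdims=True)) / np.sqrt(s - 1)
u, sv, vt = np.linalg.svd(m_data, full_matrices=False)
eigenvals = sv ** 2
print('elapsed: %.6f s' % (time.perf_counter() - start))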
NumPy arrays are the equivalent of the basic array data structure in MATLAB. With NumPy arrays, you can do things like inner and outer products, transposition, and element-wise operations.
So what you are looking for would be something like the following in Python and NumPy. (I took the liberty of not translating the MATLAB code directly; instead I refactored it a little to feel more Pythonic. Of course, quite similar refactoring could be applied to the MATLAB code as well.)
import numpy as np

def _cas(D):
    """Center at the mean and scale by sqrt(n - 1), as in the MATLAB snippet."""
    return (D - D.mean(axis=1)[:, None]) / (D.shape[1] - 1) ** 0.5

def example(D):
    """Eigenvalues and eigenvectors, based on the SVD."""
    u, s, v = np.linalg.svd(D, full_matrices=False)
    return s ** 2, u  # squared singular values are the eigenvalues, as a vector

if __name__ == '__main__':
    data = np.random.rand(5, 20)
    data = _cas(data)  # preprocess data according to your requirements
    eigenvals, eigenvecs = example(data)
    print(eigenvals)
    print(eigenvecs)
But you have a performance problem with it? Can you be more specific about your current performance, and how much you really expect it to be improved? FWIW, on my modest computer a random (500, 20000) matrix spends some 20 seconds executing example(.). I can trivially (due to basic linear algebra) reduce that execution time to around 2.5 seconds, almost a 10-fold improvement! Now, if you are looking for much better performance than this, please elaborate in much more detail on the 'nature' of your data.
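The answer does not spell out the shortcut, but one standard trick that gives this kind of speed-up for a short-and-wide matrix is to diagonalize the small Gram matrix instead of computing the full SVD. A minimal sketch (the function name and details are mine, not from the original answer):

import numpy as np

def eig_via_gram(D):
    """Eigenvalues and left singular vectors via the small Gram matrix.

    For an (n, m) matrix with n << m (e.g. 500 x 20000), the (n, n) Gram
    matrix D @ D.T is much cheaper to diagonalize than the full SVD of D:
    its eigenvalues are the squared singular values of D and its
    eigenvectors are the left singular vectors of D.
    """
    eigenvals, eigenvecs = np.linalg.eigh(D @ D.T)  # symmetric eigensolver
    order = np.argsort(eigenvals)[::-1]             # sort descending
    return eigenvals[order], eigenvecs[:, order]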
Where is your data coming from? What is your specific use case for the calculated eigenvalues and eigenvectors, i.e. what is your main goal?
If your matrix is sparse, you can use scipy.sparse.linalg.svds in a recent SciPy distribution. It wraps the sparse SVD routine in ARPACK. In my experience, it is buggy (not sure if this is the fault of SciPy or ARPACK), so I advise running tests to check that the singular values are as expected.
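For example, a quick sanity check of that sort might look like the following (the test matrix size, density, and tolerance are illustrative assumptions):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

# Compare the sparse SVD against the dense SVD on a small random matrix
# before trusting it on the real data.
A = sp.random(50, 200, density=0.05, format='csr', random_state=0)
k = 10
u, s, vt = svds(A, k=k)                           # k largest singular values
s_sparse = np.sort(s)[::-1]                       # svds does not sort them
s_dense = np.linalg.svd(A.toarray(), compute_uv=False)[:k]
print(np.allclose(s_sparse, s_dense, atol=1e-8))  # expect True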
If your matrix is dense, 500 by 20000 is big but not intractable on a commodity desktop these days. You are calling numpy.linalg.svd with full_matrices=False, right?
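With that flag, only the thin factors are computed, for example (the random matrix below is just a stand-in for your data):

import numpy as np

# With full_matrices=False only the "thin" factors are returned, so the
# prohibitively large (20000, 20000) right factor is never formed.
D = np.random.rand(500, 20000)
u, s, vt = np.linalg.svd(D, full_matrices=False)
print(u.shape, s.shape, vt.shape)  # (500, 500) (500,) (500, 20000)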
What are you using numpy.corrcoef for? Nothing like that appears in the MATLAB code snippet that you posted.