How to approximate correlation matrix in large sparse scipy matrices?

Question

So far I have used the solution from that thread link, but it raises a memory error, as expected, since my matrix A is 6 million by 40,000. I am therefore looking for another way to approximate the correlation matrix. How can I work around this problem? Any help is appreciated.

cyborg · Accepted Answer

Your problem is that you can't hold the result in memory: if you are correlating rows, that is 6e6^2 = 3.6e13 values.
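To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a dense float64 result (8 bytes per entry) and the 6 million by 40,000 shape from the question; `dense_corr_bytes` is a hypothetical helper name, not a library function.

```python
def dense_corr_bytes(n, itemsize=8):
    """Bytes needed for a dense n-by-n correlation matrix (float64 by default)."""
    return n * n * itemsize

rows, cols = 6_000_000, 40_000

# Correlating rows with rows gives a rows-by-rows result.
row_corr_tb = dense_corr_bytes(rows) / 1e12
# Correlating columns with columns gives a cols-by-cols result.
col_corr_gb = dense_corr_bytes(cols) / 1e9

print(f"row-row correlations: {row_corr_tb:.0f} TB")
print(f"column-column correlations: {col_corr_gb:.1f} GB")
```

Under those assumptions the row-by-row result needs about 288 TB, while the column-by-column result needs about 12.8 GB, so which axis you correlate matters a great deal.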

You can drop rows from the original matrix. If, for example, you are searching for highly correlated rows, you may want to cluster the rows first, so that the problem breaks into smaller pieces and correlations only need to be computed within each cluster.
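One way to act on this is sketched below, assuming you already have a cluster label per row from some clustering step (not shown here). The helper names `row_correlation` and `correlations_by_group` are hypothetical. The sketch uses the identity cov(i, j) = x_i · x_j - n * mean_i * mean_j, so the sparse block never has to be centered (and densified) explicitly; only the small per-group result is dense.

```python
import numpy as np
import scipy.sparse as sp

def row_correlation(block):
    """Dense Pearson correlation between the rows of a small sparse block."""
    block = sp.csr_matrix(block, dtype=np.float64)
    n_cols = block.shape[1]
    mean = np.asarray(block.mean(axis=1)).ravel()
    # Uncentered Gram matrix from a sparse product; centering is applied afterwards.
    gram = (block @ block.T).toarray()
    cov = gram - n_cols * np.outer(mean, mean)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    with np.errstate(divide="ignore", invalid="ignore"):
        return cov / np.outer(std, std)  # zero-variance rows yield NaN

def correlations_by_group(A, labels):
    """Map each label to (row indices, correlation matrix of those rows)."""
    A = sp.csr_matrix(A)
    labels = np.asarray(labels)
    result = {}
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        result[label] = (idx, row_correlation(A[idx]))
    return result
```

Each group of size g costs g^2 values instead of rows^2, so the groups must be small enough for that to fit in memory. Correlations between rows in different groups are simply not computed, which is the approximation being made.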

You can also use scipy.sparse.linalg.svds (a truncated SVD) to shrink the number of columns, but you will still have to handle rows^2 correlations.
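A minimal sketch of the svds route follows, under the assumption that a rank-k approximation of A is good enough for your purpose. `reduce_rows` and `approx_corr_block` are hypothetical helper names, and the choice of k is up to you. The second function computes the Pearson correlation of the rank-k reconstructed rows for one block of rows at a time, without ever forming the reconstruction.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

def reduce_rows(A, k):
    """Truncated SVD: each row of A becomes a k-dimensional row of Z."""
    A = sp.csr_matrix(A, dtype=np.float64)
    U, s, Vt = svds(A, k=k)  # requires 1 <= k < min(A.shape)
    return U * s, Vt         # Z has shape (n_rows, k); A is approximately Z @ Vt

def approx_corr_block(Z, Vt, start, stop):
    """Correlation of rows start:stop of Z @ Vt against all rows of Z @ Vt."""
    n_cols = Vt.shape[1]
    mean = Z @ Vt.mean(axis=1)  # row means of the reconstruction
    # Rows of Vt are orthonormal, so reconstructed dot products equal z_i . z_j.
    var = np.einsum("ij,ij->i", Z, Z) - n_cols * mean**2
    std = np.sqrt(np.clip(var, 0.0, None))
    cov = Z[start:stop] @ Z.T - n_cols * np.outer(mean[start:stop], mean)
    with np.errstate(divide="ignore", invalid="ignore"):
        return cov / np.outer(std[start:stop], std)
```

Each call still returns (stop - start) * rows values, so in practice you would keep only what you need from each block (for example, the entries above a threshold) before moving to the next one.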

Tags:

python

matrix

numpy

scipy

erogol
