Multiplying Numpy/Scipy Sparse and Dense Matrices Efficiently

Tags: performance, python, numpy, scipy, sparse-matrix

I'm working to implement the following equation:

X = (Y.T * Y + Y.T * C * Y) ^ -1

Y is an (n x f) matrix and C is an (n x n) diagonal matrix; n is about 300k and f varies between 100 and 200. As part of an optimization process, this equation will be evaluated almost 100 million times, so it has to run really fast.

Y is initialized randomly and C is a very sparse matrix: only a few of the 300k entries on the diagonal are different from 0. Since Numpy's diagonal functions create dense matrices, I created C as a sparse csr matrix. But when trying to compute the first part of the equation:

r = dot(C, Y)

the computer crashes due to memory limits. I then decided to convert Y to a csr_matrix and perform the same operation:

r = dot(C, Ysparse)

and this approach took 1.38 ms. But this solution is somewhat "tricky": I'm using a sparse matrix to store a dense one, and I wonder how efficient this really is.

So my question is: is there some way of multiplying the sparse C and the dense Y without having to turn Y into a sparse matrix, and of improving performance? If C could somehow be represented as a dense diagonal without consuming tons of memory, maybe this would be very efficient, but I don't know if this is possible.

I appreciate your help!


asked Nov 07 '12 15:11

Willian Fuks

4 Answers

Computing r = dot(C, Y) runs into memory issues because numpy's dot function has no native support for sparse matrices. numpy treats the sparse matrix C as a generic Python object, not as a numpy array. If you try it on a small scale, you can see the problem first hand:

>>> from numpy import dot, array
>>> from scipy import sparse
>>> Y = array([[1,2],[3,4]])
>>> C = sparse.csr_matrix(array([[1,0], [0,2]]))
>>> dot(C,Y)
array([[  (0, 0)    1
  (1, 1)    2,   (0, 0) 2
  (1, 1)    4],
  [  (0, 0) 3
  (1, 1)    6,   (0, 0) 4
  (1, 1)    8]], dtype=object)

Clearly the above is not the result you are interested in. Instead, compute the product with scipy's sparse.csr_matrix.dot function:

r = sparse.csr_matrix.dot(C, Y)

or more compactly

r = C.dot(Y)
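
As a minimal sketch of the point above, the snippet below builds a sparse diagonal C and a dense Y at a small, illustrative size (the sizes, the seed, and the chosen nonzero positions are assumptions, not values from the question) and checks that C.dot(Y) returns an ordinary dense array equal to scaling each row of Y by the matching diagonal entry.

```python
import numpy as np
from scipy import sparse

n, f = 1000, 4                          # small illustrative sizes (assumed)
rng = np.random.default_rng(0)
Y = rng.random((n, f))                  # dense (n x f) matrix

c = np.zeros(n)                         # diagonal of C, almost all zeros
c[[3, 10, 500]] = [2.0, -1.0, 0.5]      # a few arbitrary nonzero entries
C = sparse.diags(c, format="csr")       # sparse (n x n) diagonal matrix

r = C.dot(Y)                            # sparse-times-dense product, dense result

# Multiplying by a diagonal matrix just scales row i of Y by c[i].
same = np.allclose(r, c[:, None] * Y)
print(type(r).__name__, r.shape, same)
```

Y never has to be converted to a sparse format for this product.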

answered Sep 19 '22 14:09

M.H.

Try:

import numpy as np
from scipy import sparse

f = 100
n = 300000

Y = np.random.rand(n, f)
Cdiag = np.random.rand(n) # diagonal of C
Cdiag[np.random.rand(n) < 0.99] = 0

# Compute Y.T * C * Y, skipping zero elements
mask = np.flatnonzero(Cdiag)
Cskip = Cdiag[mask]

def ytcy_fast(Y):
    Yskip = Y[mask,:]
    CY = Cskip[:,None] * Yskip  # broadcasting
    return Yskip.T.dot(CY)

%timeit ytcy_fast(Y)

# For comparison: all-sparse matrices
C_sparse = sparse.spdiags([Cdiag], [0], n, n)
Y_sparse = sparse.csr_matrix(Y)
%timeit Y_sparse.T.dot(C_sparse * Y_sparse)

My timings:

In [59]: %timeit ytcy_fast(Y)
100 loops, best of 3: 16.1 ms per loop

In [18]: %timeit Y_sparse.T.dot(C_sparse * Y_sparse)
1 loops, best of 3: 282 ms per loop
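
The same row-skipping idea can be extended to the whole expression from the question. The sketch below assumes that Y stays fixed while the diagonal of C changes between calls (the question does not say so), so Y.T * Y is computed only once. The names make_inverse_updater and compute_X are hypothetical helpers introduced for illustration.

```python
import numpy as np

def make_inverse_updater(Y):
    """Precompute Y.T @ Y once and return a function that builds X for a given diagonal."""
    YtY = Y.T @ Y                            # (f x f), does not depend on C

    def compute_X(c_diag):
        idx = np.flatnonzero(c_diag)         # rows where the diagonal of C is nonzero
        Ys = Y[idx, :]                       # only these rows contribute to Y.T C Y
        YtCY = Ys.T @ (c_diag[idx][:, None] * Ys)
        return np.linalg.inv(YtY + YtCY)     # X = (Y.T Y + Y.T C Y)^-1

    return compute_X

rng = np.random.default_rng(0)
n, f = 2000, 5                               # small illustrative sizes (assumed)
Y = rng.random((n, f))
c = np.zeros(n)
c[rng.choice(n, size=20, replace=False)] = rng.random(20) + 0.1

compute_X = make_inverse_updater(Y)
X = compute_X(c)

# Reference: the formula evaluated directly, without skipping zero rows.
X_ref = np.linalg.inv(Y.T @ Y + Y.T @ (c[:, None] * Y))
print(np.allclose(X, X_ref))
```

With only a few nonzero diagonal entries, the per-call work is dominated by the small (f x f) inversion rather than by anything of size n.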

answered Sep 20 '22 14:09

pv.

First, are you really sure you need to perform a full matrix inversion in your problem? Most of the time, one only really needs to compute x = A^-1 y, which is a much easier problem to solve.

If you really do need the inverse, I would consider computing an approximation of the inverse matrix instead of the full matrix inversion, since matrix inversion is really costly. See for example the Lanczos algorithm for an efficient approximation of the inverse matrix. As a bonus, the approximation can be stored sparsely. It also requires only matrix-vector operations, so you don't even have to store the full matrix you want to invert.

As an alternative, with pyoperators you can use the .todense method to build the matrix to invert from efficient matrix-vector operations. There is a special sparse container for diagonal matrices.

For an implementation of the Lanczos algorithm, you can have a look at pyoperators (disclaimer: I am one of the coauthors of this piece of software).
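
The first point above, solving a linear system rather than forming the inverse, can be sketched with plain NumPy. This is only an illustration under assumptions: the right-hand side b is hypothetical (the question never names one), and the sizes are kept small.

```python
import numpy as np

rng = np.random.default_rng(0)
n, f = 500, 4                            # small illustrative sizes (assumed)
Y = rng.random((n, f))
c = np.zeros(n)
c[[1, 7, 42]] = [1.5, 0.3, 2.0]          # a few nonzero diagonal entries of C
b = rng.random(f)                        # hypothetical right-hand side

A = Y.T @ Y + Y.T @ (c[:, None] * Y)     # the (f x f) matrix that would be inverted

x_solve = np.linalg.solve(A, b)          # solves A x = b without forming A^-1
x_inv = np.linalg.inv(A) @ b             # explicit inverse, shown only for comparison

print(np.allclose(x_solve, x_inv))
```

If only products of the form A^-1 b are needed downstream, the explicit inverse never has to be built.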

answered Sep 20 '22 14:09

Nicolas Barbey

I don't know if it was possible when the question was asked, but nowadays broadcasting is your friend. To be used in a matrix product, an n x n diagonal matrix only needs to be stored as a 1-D array of its diagonal elements:

>>> import numpy as np
>>> n, f = 5, 3
>>> Y = np.random.randint(0, 10, (n, f))
>>> C = np.random.randint(0, 10, (n,))
>>> Y.shape
(5, 3)
>>> C.shape
(5,)
>>> np.all(Y.T @ np.diag(C) @ Y == Y.T*C @ Y)
True

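Do note that Y.T*C @ Y cannot be regrouped freely: * and @ have the same precedence and are evaluated left to right, so the expression means (Y.T*C) @ Y. Grouping it the other way fails: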

>>> Y.T*(C @ Y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,5) (3,)

But Y.T @ (C[:, np.newaxis]*Y) yields the expected result:

>>> np.all(Y.T*C @ Y == Y.T@(C[:, np.newaxis]*Y))
True
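
To put the groupings above side by side, here is a small sketch that compares them against the dense np.diag reference and adds an np.einsum spelling of the same contraction. The sizes and seed are arbitrary. For scale, a dense float64 diagonal matrix with n = 300,000 would take 300,000 x 300,000 x 8 bytes, roughly 720 GB, which is why the 1-D form matters.

```python
import numpy as np

rng = np.random.default_rng(0)
n, f = 6, 3                                  # tiny sizes so np.diag stays cheap
Y = rng.integers(0, 10, size=(n, f))
c = rng.integers(0, 10, size=n)              # diagonal of C as a 1-D array

ref = Y.T @ np.diag(c) @ Y                   # dense diagonal, fine only for tiny n
a = (Y.T * c) @ Y                            # scale the columns of Y.T, then multiply
b = Y.T @ (c[:, np.newaxis] * Y)             # scale the rows of Y, then multiply
e = np.einsum("if,i,ig->fg", Y, c, Y)        # the same sum over i written explicitly

print(np.array_equal(ref, a), np.array_equal(ref, b), np.array_equal(ref, e))
```

All three forms avoid materializing the n x n matrix; which one is fastest depends on sizes and the NumPy build, so it is worth timing on real data.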

answered Sep 18 '22 14:09

Aubergine
