When should I be using `sparse`?

Tags:

sparse-matrix

I've been looking through Matlab's sparse documentation trying to find whether there are any guidelines for when it makes sense to use a sparse representation rather than a full representation.

For example, I have a matrix data with around 30% nonzero entries. I can check the memory used.

whos data
  Name             Size                 Bytes  Class     Attributes

  data      84143929x11            4394073488  double    sparse

data = full(data);
whos data
  Name             Size                 Bytes  Class     Attributes

  data      84143929x11            7404665752  double

Here, I'm clearly saving memory, but would this be true of any matrix with 30% nonzero entries? What about 50% nonzero entries? Is there a rule of thumb for at what percentage I should switch to a full matrix?

What about computationally? Is it as a rule slower or faster to do a matrix multiplication with a sparse matrix? Sparse Matrix Operations says that

The computational complexity of sparse operations is proportional to nnz, the number of nonzero elements in the matrix. Computational complexity also depends linearly on the row size m and column size n of the matrix, but is independent of the product m*n, the total number of zero and nonzero elements.

This is difficult to compare to a full matrix without knowing more details.

Scipy's sparse matrix library explains pros and cons of each sparse format. For example for the csc_matrix

Advantages of the CSC format

efficient arithmetic operations CSC + CSC, CSC * CSC, etc.

efficient column slicing

fast matrix vector products (CSR, BSR may be faster)

Disadvantages of the CSC format

slow row slicing operations (consider CSR)

changes to the sparsity structure are expensive (consider LIL or DOK)

Does similar information about Matlab's sparse implementation exist? If so where can I find it?

757

asked Apr 21 '15 21:04

Cecilia

1 Answers

Many operations on full matrices use BLAS/LAPACK library calls that are insanely optimized and tough to beat. In practice, operations on sparse matrices will only outperform those on full matrices in specialized situations that can sufficiently exploit (i) sparsity and (ii) special matrix structure.

Just randomly using sparse probably will make you worse off. Example: which is faster, adding a 10000x10000 full matrix to a 10000x10000 full matrix? Or adding a 10000x10000 full matrix to an entirely sparse (i.e. everything is zero) 10000x10000 matrix? try it! On my system, the full + full is faster!

What are some examples of situations where sparse CRUSHES full?

Example 1: solving linear system A*x=b where A is 5000x5000 but is block diagonal matrix constructed of 500 5x5 blocks. Setup code:

As = sparse(rand(5, 5));
for(i=1:999)
   As = blkdiag(As, sparse(rand(5,5))); 
end;                         %As is made up of 500 5x5 blocks along diagonal
Af = full(As); b = rand(5000, 1);

Then you can test speed difference:

As \ b % operation on sparse As takes .0012 seconds
Af \ b % solving with full Af takes about 2.3 seconds

In general, a 5000 variable linear system is somewhat difficult, but 1000 separate 5 variable linear systems is trivial. The latter is basically what gets solved in the sparse case.

The overall story is that if you have special matrix structure and can cleverly exploit sparsity, it's possible to solve insanely large problems that otherwise would be intractable. If you have a specialized problem that is sufficiently large, have a matrix that is sufficiently sparse, and are clever with linear algebra (so as to preserve sparsity), a sparse typed matrix can be extremely powerful.

On the other hand, randomly throwing in sparse without deep, careful thought is almost certainly going to make your code slower.

116

answered Oct 21 '22 19:10

Matthew Gunn

Related questions
                            
                                inpolygon for Python - Examples of matplotlib.path.Path contains_points() method?
                            
                                Implementation of parallel coordinates? [closed]
                            
                                print variable-name in Matlab
                            
                                Can I assign types to class properties in MATLAB?
                            
                                How to get the largest element index in matlab array
                            
                                Can a Python function return only the second of two values?
                            
                                How can I display numbers with higher precision in a MATLAB data cursor?
                            
                                How to check in MATLAB if a vector only contains zeros?
                            
                                Replace values in matrix with other values
                            
                                Assign a value to multiple cells in matlab
                            
                                Remove noise from wav file, MATLAB
                            
                                Matlab - How to make a figure current? How to make an axes current?
                            
                                process a list of files with a specific extension name in matlab
                            
                                Does Matlab eig always returns sorted values?
                            
                                Grid detection in matlab
                            
                                draw ellipse and ellipsoid in MATLAB
                            
                                Matlab segmentation fault when iterating vector assignment
                            
                                Is there a shortcut to execute the current line in Matlab code?
                            
                                Is there a way of selectively including code when publishing in Matlab?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With