
Why is sparse-dense multiplication faster than dense-sparse multiplication?

I am curious as to why multiplying a sparse matrix by a dense matrix takes a different amount of time than the reverse. Are the algorithms significantly different?

Here's an example in MATLAB 2018a:

a=sprand(M,M,0.01);
b=rand(M);
tic;ref1=a*b;t_axb=toc
tic;ref2=b*a;t_bxa=toc

Here's an example with Eigen 3 and C++ using 1 thread:

//prepare acol: an M x M column-major (CSC) Eigen sparse matrix with 0.01 density
...
// PR points to the column-major dense data of b
Map<Matrix<double,Dynamic,Dynamic,ColMajor> > bcol(PR, M, M);
double tic, toc;

tic=getHighResolutionTime();
result=acol*bcol;
toc=getHighResolutionTime();
printf("\nacol*bcol time: %f seconds", (toc - tic));

tic=getHighResolutionTime();
result=bcol*acol;
toc=getHighResolutionTime();
printf("\nbcol*acol time: %f seconds\n", (toc - tic));

When M=4000, the results are:

t_axb =
    0.6877
t_bxa =
    0.4803

acol*bcol time: 0.937590 seconds
bcol*acol time: 0.532622 seconds

When M=10000, the results are:

t_axb =
   11.5649
t_bxa =
    9.7872

acol*bcol time: 20.140380 seconds
bcol*acol time: 8.061626 seconds

In both cases, the sparse-dense product is slower than the dense-sparse product, for both MATLAB and Eigen. I am curious as to:

  1. Why is this the case? Are the algorithms for sparse-dense products significantly different from those for dense-sparse products? The number of FLOPs is the same, right?

  2. Why does Eigen match or exceed MATLAB's performance for the dense-sparse product but not for the sparse-dense product? A small difference in performance is normal, but a factor of ~1.4-1.8 seems strange given that both are highly optimized libraries. I am compiling Eigen with all the optimizations recommended in the documentation, i.e. -fPIC -fomit-frame-pointer -O3 -DNDEBUG -fopenmp -march=native.

asked Jul 22 '18 by avgn



1 Answer

You can observe the same difference by comparing column-major versus row-major storage for the sparse-matrix times vector product y = A * x. If A is row-major, then each row of A (equivalently, each coefficient of y) can be treated in parallel without any overhead: no communication, no additional temporary, no additional operations. In contrast, if A is column-major, multi-threading cannot come for free, and in most cases the overhead is larger than the gain.
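For illustration, here is a minimal row-major (CSR) sketch of y = A * x. The container layout and the names CsrMatrix, val, col, rowPtr and spmv_csr are generic CSR conventions chosen for this example, not Eigen's or MATLAB's internal kernels. Because each y[i] is owned by exactly one outer iteration, a single OpenMP pragma parallelizes the loop with no synchronization at all:

#include <vector>

// Generic CSR storage: nonzeros stored row by row.
struct CsrMatrix {
    int n;                      // number of rows (square matrix for simplicity)
    std::vector<double> val;    // nonzero values
    std::vector<int> col;       // column index of each nonzero
    std::vector<int> rowPtr;    // rowPtr[i]..rowPtr[i+1] delimits row i, size n+1
};

// y = A * x with A in row-major (CSR) storage.
void spmv_csr(const CsrMatrix& A, const std::vector<double>& x, std::vector<double>& y)
{
    // Rows are independent: no locks, no temporaries, no extra operations.
    #pragma omp parallel for
    for (int i = 0; i < A.n; ++i) {
        double sum = 0.0;
        for (int k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k)
            sum += A.val[k] * x[A.col[k]];   // random read-only accesses to x
        y[i] = sum;                          // each y[i] written exactly once, by one thread
    }
}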

Even without multi-threading, the memory access patterns are very different:

  • Row-major: multiple random read-only accesses to x; each coefficient of y is written only once.
  • Col-major: each coefficient of x is read only once, but we get multiple random read-write accesses to the destination y.

So even without multi-threading, the situation naturally favors row-major storage.
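For contrast, here is a minimal column-major (CSC) sketch of the same product, again with generic, illustrative names (CscMatrix, row, colPtr, spmv_csc) rather than any library's actual kernel. Every x[j] is read once, but y is updated through scattered read-modify-writes, so a parallel version would need atomics, colouring, or per-thread copies of y:

#include <algorithm>
#include <vector>

// Generic CSC storage: nonzeros stored column by column.
struct CscMatrix {
    int n;                      // number of columns (square matrix for simplicity)
    std::vector<double> val;    // nonzero values
    std::vector<int> row;       // row index of each nonzero
    std::vector<int> colPtr;    // colPtr[j]..colPtr[j+1] delimits column j, size n+1
};

// y = A * x with A in column-major (CSC) storage.
void spmv_csc(const CscMatrix& A, const std::vector<double>& x, std::vector<double>& y)
{
    std::fill(y.begin(), y.end(), 0.0);
    for (int j = 0; j < A.n; ++j) {
        const double xj = x[j];                  // each x[j] read once
        for (int k = A.colPtr[j]; k < A.colPtr[j + 1]; ++k)
            y[A.row[k]] += A.val[k] * xj;        // random read-write accesses to y
    }
}

Roughly the same picture carries over to the benchmarks above if you think of each column of the result as one y: with everything column-major, each column of acol*bcol is a sparse-times-dense-vector product with scattered updates into the result column, whereas each column of bcol*acol is a short linear combination of dense columns of bcol, which streams through memory sequentially. This is only an illustration of the access-pattern argument, not a description of Eigen's or MATLAB's actual kernels.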

answered Sep 24 '22 by ggael