I am curious as to why multiplying a sparse matrix by a dense matrix takes a different amount of time than multiplying in the reverse order. Are the algorithms significantly different?
Here's an example in MATLAB 2018a:
M = 4000;   % also tested with M = 10000
a = sprand(M, M, 0.01);
b = rand(M);
tic; ref1 = a*b; t_axb = toc
tic; ref2 = b*a; t_bxa = toc
Here's an example with Eigen 3 and C++ using 1 thread:
//prepare acol=MxM ColMajor Eigen sparse matrix with 0.01 density
...
// M is a runtime size here, so map the raw buffer PR with Dynamic extents
Map<Matrix<double,Dynamic,Dynamic,ColMajor> > bcol(PR, M, M);
double tic,toc;
tic=getHighResolutionTime();
result=acol*bcol;
toc=getHighResolutionTime();
printf("\nacol*bcol time: %f seconds", (toc - tic));
tic=getHighResolutionTime();
result=bcol*acol;
toc=getHighResolutionTime();
printf("\nbcol*acol time: %f seconds\n", (toc - tic));
When M=4000, the results are:
t_axb =
0.6877
t_bxa =
0.4803
acol*bcol time: 0.937590 seconds
bcol*acol time: 0.532622 seconds
When M=10000, the results are:
t_axb =
11.5649
t_bxa =
9.7872
acol*bcol time: 20.140380 seconds
bcol*acol time: 8.061626 seconds
In both cases, the sparse-dense product is slower than the dense-sparse product for both MATLAB and Eigen. I am curious as to:
1. Why is this the case? Are the algorithms for sparse-dense significantly different from those for dense-sparse? The number of FLOPs is the same, right?
2. Why does Eigen match or exceed MATLAB's performance for the dense-sparse product but not for the sparse-dense product? A small difference in performance is normal, but a factor of ~1.4-1.8 seems strange given that both are highly optimized libraries. I am compiling Eigen with all the optimizations as per the documentation, i.e. -fPIC -fomit-frame-pointer -O3 -DNDEBUG -fopenmp -march=native
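(As a sanity check on the FLOP count: with b a dense M x M matrix, every nonzero of a is multiplied against M entries of b regardless of the order of the product, so flops(a*b) = flops(b*a) = 2 * nnz(a) * M ≈ 2 * 0.01 * M^3, which is about 2e10 flops either way for M = 10000.)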
Using sparse matrices to store data that contains a large number of zero-valued elements can both save a significant amount of memory and speed up the processing of that data.
For what it is worth, for random sparse matrices of size 10,000 by 10,000 vs. dense matrices of the same size, on my Xeon workstation using MATLAB and Intel MKL as the BLAS, the sparse matrix-vector multiply was faster for densities of 15% or less.
Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices, as processing and memory are wasted on the zeros. Sparse data is by nature more easily compressed and thus requires significantly less storage.
A sparse matrix is one stored in a compressed format that keeps only the nonzero values (along with their indices), whereas a dense matrix explicitly stores every entry, zero and nonzero alike.
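To make the storage savings concrete for the sizes in the question (assuming a typical compressed-column layout with 8-byte double values and 4-byte integer indices; actual index widths vary by library):
M = 10000, density 0.01  ->  nnz ≈ 1e6
dense:  10000 * 10000 * 8 B               ≈ 800 MB
sparse: 1e6 * (8 B + 4 B) + 10001 * 4 B   ≈  12 MB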
You could observe the same difference by comparing column-major versus row-major storage for the sparse-matrix times vector product y = A * x. If A is row-major (equivalently, each coefficient of y can be computed independently), then each row of A can be processed in parallel without any overhead (no communication, no additional temporary, no additional operation). In contrast, if A is column-major, multi-threading cannot come for free, and in most cases the overhead is larger than the gain.
Even without multi-threading, you can see that the memory access patterns are very different:
- Row-major: multiple random read-only accesses to x, with each coefficient of y written only once.
- Column-major: each coefficient of x is read once, but we get multiple random read-write accesses to the destination y.
So even without multi-threading, the situation is naturally favorable to row-major.
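Here is a minimal sketch of the two kernels in plain C++ (hypothetical helper names, not Eigen's actual implementation), assuming standard CSR/CSC storage arrays, to make the access patterns explicit:

#include <vector>

// Row-major (CSR) y = A*x: random read-only gathers from x,
// each y[i] written exactly once; rows are independent, so a
// parallel loop over i needs no synchronization.
void csr_matvec(const std::vector<double>& val,
                const std::vector<int>& col_idx,
                const std::vector<int>& row_ptr,
                const std::vector<double>& x,
                std::vector<double>& y)
{
    const int rows = (int)row_ptr.size() - 1;
    for (int i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            acc += val[k] * x[col_idx[k]];  // random reads of x
        y[i] = acc;                         // y[i] written once
    }
}

// Column-major (CSC) y = A*x: each x[j] is read once, but y receives
// scattered read-modify-write updates; parallelizing over j would
// need atomics or per-thread copies of y. Assumes y is zero-initialized.
void csc_matvec(const std::vector<double>& val,
                const std::vector<int>& row_idx,
                const std::vector<int>& col_ptr,
                const std::vector<double>& x,
                std::vector<double>& y)
{
    const int cols = (int)col_ptr.size() - 1;
    for (int j = 0; j < cols; ++j) {
        const double xj = x[j];             // x[j] read once
        for (int k = col_ptr[j]; k < col_ptr[j + 1]; ++k)
            y[row_idx[k]] += val[k] * xj;   // random read-writes to y
    }
}

The sparse-dense matrix product in the question behaves like one such mat-vec per column of the dense factor, so the same access-pattern argument carries over.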