I've been wondering about this question for quite a while but cannot find a reference: How does Matlab transpose a sparse matrix so fast, given that it is stored in CSC (compressed sparse column) format?
Also its documentation verifies the efficiency of sparse matrix transposition:
To do this (accessing row by row), you can transpose the matrix, perform operations on the columns, and then retranspose the result … The time required to transpose the matrix is negligible.
Follow-up (modified as suggested by @Mikhail):
I agree with @Roger and @Milhail that setting a flag is enough for many operations such as the BLAS or sparse BLAS operations in terms of their interfaces. But it appears to me that Matlab does "actual" transposition. For example, I have a sparse matrix X with size m*n=7984*12411, and I want to scale each column and each row:
% scaling each column
t = 0;
for i = 1 : 1000
A = X; t0 = tic;
A = bsxfun(@times, A, rand(1,n));
t = t + toc(t0);
end
t = 0.023636 seconds
% scaling each row
t = 0;
for i = 1 : 1000
A = X; t0 = tic;
A = bsxfun(@times, A, rand(m,1));
t = t + toc(t0);
end
t = 138.3586 seconds
% scaling each row by transposing X and transforming back
t = 0;
for i = 1 : 1000
A = X; t0 = tic;
A = A'; A = bsxfun(@times, A, rand(1,m)); A = A';
t = t + toc(t0);
end
t = 19.5433 seconds
This result means that accessing column by column is faster than accessing row by row. It makes sense because sparse matrices are stored column by column. So the only reason for the fast speed of column scaling of X' should be that X is actually transposed to X' instead of setting a flag.
Also, if every sparse matrix is stored in CSC format, simply setting a flag cannot make X' in CSC format.
Any comments? Thanks in advance.
After a week's exploration, my guess about the internal mechanism of transposing a matrix is sorting.
Suppose A
is a sparse matrix,
[I, J, S] = find(A);
[sorted_I, idx] = sort(I);
J = J(idx);
S = S(idx);
B = sparse(J, sorted_I, S);
Then B
is the transpose of A
.
The above implementation has about half the efficiency of Matlab's built-in transpose
on my machine. Considering Matlab's built-in functions are multi-threaded, my guess might be a reasonable one.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With