Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Solving multiple linear systems using vectorization

Sorry if this is obvious but I searched a while and did not find anything (or missed it).

I'm trying to solve linear systems of the form Ax=B with A a 4x4 matrix, and B a 4x1 vector.

I know that for a single system I can use mldivide to obtain x: x=A\B.

However I am trying to solve a great number of systems (possibly > 10000) and I am reluctant to use a for loop because I was told it is notably slower than matrix formulation in many MATLAB problems.

My question is then: is there a way to solve Ax=B using vectorization with A 4x4x N and B a matrix 4x N ?

PS: I do not know if it is important but the B vector is the same for all the systems.

like image 506
Aabaz Avatar asked Jun 14 '11 14:06

Aabaz


4 Answers

You should use a for loop. There might be a benefit in precomputing a factorization and reusing it, if A stays the same and B changes. But for your problem where A changes and B stays the same, there's no alternative to solving N linear systems.

You shouldn't worry too much about the performance cost of loops either: the MATLAB JIT compiler means that loops can often be just as fast on recent versions of MATLAB.

like image 107
Tom Avatar answered Nov 12 '22 21:11

Tom


I don't think you can optimize this further. As explained by @Tom, since A is the one changing, there is no benefit in factoring the various A's beforehand...

Besides the looped solution is pretty fast given the dimensions you mention:

A = rand(4,4,10000);
B = rand(4,1);          %# same for all linear systems

tic
X = zeros(4,size(A,3));
for i=1:size(A,3)
    X(:,i) = A(:,:,i)\B;
end
toc

Elapsed time is 0.168101 seconds.

like image 29
Amro Avatar answered Nov 12 '22 22:11

Amro


Here's the problem: you're trying to perform a 2D operation (mldivide) on a 3d matrix. No matter how you look at it, you need reference the matrix by index which is where the time penalty kicks in... it's not the for loop which is the problem, but it's how people use them. If you can structure your problem differently, then perhaps you can find a better option, but right now you have a few options:

1 - mex

2 - parallel processing (write a parfor loop)

3 - CUDA

like image 3
Rasman Avatar answered Nov 12 '22 21:11

Rasman


Here's a rather esoteric solution that takes advantage of MATLAB's peculiar optimizations. Construct an enormous 4k x 4k sparse matrix with your 4x4 blocks down the diagonal. Then solve all simultaneously.

On my machine this gets the same solution up to single precision accuracy as @Amro/Tom's for-loop solution, but faster.

n = size(A,1);
k = size(A,3);
AS = reshape(permute(A,[1 3 2]),n*k,n);
S = sparse( ...
  repmat(1:n*k,n,1)', ...
  bsxfun(@plus,reshape(repmat(1:n:n*k,n,1),[],1),0:n-1), ...
  AS, ...
  n*k,n*k);
X = reshape(S\repmat(B,k,1),n,k);

for a random example:

For k = 10000
For loop: 0.122570 seconds.
Giant sparse system: 0.032287 seconds.

If you know that your 4x4 matrices are positive definite then you can use chol on S to improve the accuracy.

This is silly. But so is how slow matlab's for loops still are in 2015, even with JIT. This solution seems to find a sweet spot when k is not too large so everything still fits into memory.

like image 3
Alec Jacobson Avatar answered Nov 12 '22 21:11

Alec Jacobson