Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to vectorize double summation in matlab

I'm trying to produce vectorized code for this equation (and multiple others of the same form):

This will be evaluated in ode45 and the only part that changes is Rmn(t) so I am pre-computing the Bessel and sin functions. At the moment my code looks like this:

for iR = 1:nR
    p(:, iR) = sum(Rmn.*J0m(:, :, iR),1)*sinanz;
end

M, N are the number of terms in my summation and R, Z are the number of r and z coordinates I'm using. Rmn is an M*N matrix. J0m is a M*N*R array. It is an M*R matrix that is repeated N times. sinanz is an N*Z matrix. J0m and sinanz are pre-computed and don't change.

This works but is slow and so I am trying to optimize it. I figured the first step would be to reduce J0m so it is only m*R but I can't figure out how. I am looking for any suggestions on how to do this and any suggestions on how to optimize this in general.

like image 457
Jazradel Avatar asked Oct 21 '13 04:10

Jazradel


People also ask

How do you do a vector sum in MATLAB?

S = sum( A ) returns the sum of the elements of A along the first array dimension whose size does not equal 1. If A is a vector, then sum(A) returns the sum of the elements. If A is a matrix, then sum(A) returns a row vector containing the sum of each column.

What does || do in MATLAB?

The "|" operator is an element-wise operator, intended to be used on arrays element-by-element. The "||" operator is a short-circuiting operator restricted to be used on scalars only. See the doc: https://www.mathworks.com/help/matlab/logical-operations.html.

How do I sum two arrays in MATLAB?

C = A + B adds arrays A and B by adding corresponding elements. If one input is a string array, then plus appends the corresponding elements as strings. The sizes of A and B must be the same or be compatible. If the sizes of A and B are compatible, then the two arrays implicitly expand to match each other.

What does double summation mean?

Double sum is nothing more than sum of a sum. If the elements of the sum have two indices and you want to add the index one by one, then you use double sums. In calculating double summations, here are the steps.


1 Answers

As you probably already know, you should pre-allocate p before the loop:

p = zeros(Z, nR);

this prevents the array p from growing at each iteration, speeding up the loop tremendously.

You could vectorize the whole thing by virtue of bsxfun:

% C ≣ M×N×R array of all products Rmn·J0m
C = bsxfun(@times, Rmn, J0m);

% D ≣ M×N×R×Z array of all products C·sinanz
D = bsxfun(@times, C, permute(sinanz, [3 1 4 2]));

% Sum in M and N directions, and re-order
p = permute(sum(sum(D,1),2), [4 3 1 2]);

but I doubt it's going to be faster; MATLAB (read: BLAS) is pretty fast with 2D matrices, but usually not terribly good with more-D arrays.

I suggest you read about bsxfun; it's also the way to reduce the J0m array to M×R in the way you describe.

Of course, you can get rid of the permutes by properly defining your variables, so let's do a small test in the 'ideal' versions of both the looped and vectorized code:

%% Initialize some variables

% Number of tests
TESTS = 1e4;

% Your dimensions
M = 5;   nR = 4;
N = 2;   Z  = 3;

% Some dummy data
Rmn    = rand(M,N);
sinanz = rand(N,Z);
J0m    = rand(M,nR); % NOTE: J0m doesn't have to be 3D anymore by using bsxfun

%% Test 1: your own version, optimized

% Start test
start = tic;

% Repeat the test a couple of times to get a good average
for ii = 1:TESTS
    p1 = zeros(Z,nR);
    for iR = 1:nR
        p1(:, iR) = sum( bsxfun(@times,Rmn,J0m(:, iR)), 1 )*sinanz;
    end    
end
stop = toc(start);

% Average execution time
fprintf(1, 'Average time of looped version: %f seconds.\n', stop/TESTS);

%% Vectorized version, using 4D arrays: 

% Don't make the permutes part of the test
J0m    = permute(J0m, [1 3 2]);
sinanz = permute(sinanz, [3 1 4 2]);

% Start test
start = tic;

% Repeat the test a couple of times to get a good average
for ii = 1:TESTS
    C  = bsxfun(@times, Rmn, J0m);
    D  = bsxfun(@times, C, sinanz);
    p2 = sum(sum(D,1),2);

end
stop = toc(start);

% Average execution time
fprintf(1, 'Average time of vectorized version: %f seconds.\n', stop/TESTS);

%% Check for equality

maxDifference = max(max(p1 - squeeze(p2)'))

Results:

Average time of looped version    : 0.000054 seconds.
Average time of vectorized version: 0.000023 seconds.
maxDifference =
    4.440892098500626e-016

which seems pretty good! However, using

M = 50;   nR = 40;
N = 20;   Z  = 30;

you get

Average time of looped version    : 0.000708 seconds.
Average time of vectorized version: 0.009835 seconds.
maxDifference =
    8.526512829121202e-014

so the vectorized version just got an order of magnitude slower than the looped variant.

Of course, your mileage may vary, but the take home message is that you should expect this difference to get worse and worse for increasing dimensions.

So, in conclusion:

  • if your dimensions are large, stick with the loop. If they are small, also use the loop - it's just SO much easier on the eyes than the vectorized version :)
  • make sure to pre-allocate the p variable
  • use bsxfun to reduce the memory footprint (but it won't increase the speed much)
like image 150
Rody Oldenhuis Avatar answered Oct 19 '22 18:10

Rody Oldenhuis