Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Summing array by indices from another cell array

I have an array:

a = [109, 894, 566, 453, 342, 25]

and another cell array of sub-indices of a, denoted as:

subs = { [1,3,4], [2,5,6], [1,3], [3,4], [2,3,4], [6] };    

I want to avoid the for-loop to calculate the following summations via MATLAB:

for i=1:6
    sums_a(i) = sum(a(subs{i}));
end

Is there any fast way such as arrayfun to implement this? Thanks.

like image 382
John Smith Avatar asked May 26 '13 08:05

John Smith


2 Answers

use cellfun

sums_a = cellfun( @(sx) sum( a(sx) ), subs );

PS,
It is best not to use i and j as variable names in Matlab.

like image 192
Shai Avatar answered Nov 15 '22 07:11

Shai


If you're looking for speed, arrayfun can be rather slow. As commented by Andrew Horchler, in latest releases of MATLAB for loops can be pretty fast thanks to JIT acceleration. If you still insist on avoiding the loop, here's a tricky solution without for loops that employs accumarray:

idx = cumsum(cellfun('length', subs));
x = diff(bsxfun(@ge, [0; idx(:)], 1:max(idx)));
x = sum(bsxfun(@times, x', 1:numel(subs)), 2);  %'// Produce subscripts
y = a([subs{:}]);                               % // Obtain values
sums_a = accumarray(x, y);                      % // Accumulate values

This can be written as a (rather long) one-liner actually, but it's split into several lines for clarity.

Explanation

This values to be accumulated are obtained like so:

y = a([subs{:}]);

In your example their corresponding indices should be:

1    1    1    2    2    2    3    3    4    4    5    5    5    6

That is:

  1. The first 3 values from y are accumulated and the result is stored as the first element in the output.
  2. The next 3 values are accumulated and the result is stored as the second element in the output.
  3. The next 2 values are accumulated and the result is stored as the third element in the output.

and so on...

The following lines do the magic to produce such a vector of indices x:

idx = cumsum(cellfun('length', subs));
x = diff(bsxfun(@ge, [0; idx(:)], 1:max(idx)));
x = sum(bsxfun(@times, x', 1:numel(subs)), 2);

Lastly, x and y are fed into accumarray:

sums_a = accumarray(x, y);

and voila.

Benchmark

Here's the benchmarking code:

a = [109,894,566,453,342,25];
subs = {[1,3,4], [2,5,6], [1,3], [3,4], [2,3,4], 6};

% // Solution with arrayfun
tic
for k = 1:1000
    clear sums_a1
    sums_a1 = cellfun( @(subs) sum( a(subs) ), subs );
end
toc

% // Solution with accumarray
tic
for k = 1:1000
    clear sums_a2
    idx = cumsum(cellfun('length', subs));
    x = diff(bsxfun(@ge, [0; idx(:)], 1:max(idx)));
    x = sum(bsxfun(@times, x', 1:numel(subs)), 2);
    sums_a2 = accumarray(x, a([subs{:}]));
end
toc

%'// Solution with for loop
tic
for k = 1:1000
    clear sums_a3
    for n = 1:6
        sums_a3(n) = sum(a(subs{n}));
    end
end
toc

The results on my machine are:

Elapsed time is 0.027746 seconds.
Elapsed time is 0.004228 seconds.
Elapsed time is 0.001959 seconds.

There's an almost tenfold speedup for accumarray vs. arrayfun, but notice that the for loop still beats both.

like image 34
Eitan T Avatar answered Nov 15 '22 07:11

Eitan T