
Accelerating Iterations - MATLAB

Consider two vectors, A and B, each of size 20000000 x 1.

For every unique element of B, I need to find the sum of the corresponding elements of A.

Although this looks really easy, this is taking forever in MATLAB.

Currently, I am using

u = unique(B);
length_u = length(u);
C = zeros(length_u,1);

for i = 1:length_u
   C(i,1) = sum(A(B==u(i)));
end

Is there any way to make it run faster? I tried splitting the loop and running 2 parfor loops using the Parallel Computing Toolbox (because I have only 2 cores). It still takes hours.

P.S: Yes, I should get a better computer.

enigmae asked Jun 26 '14 07:06





2 Answers

You must see this answer first.
If you must, you can use a combination of histc and accumarray:

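% Example data: 100000 random integers in 1..500
A = randi( 500, 1, 100000 );
B = randi( 500, 1, 100000 );

ub = unique( B );  % sorted unique values of B

% idx(k) is the bin (position in ub) that B(k) falls into
[~, idx] = histc( B, [ub-.5 ub(end)+.5] );
% sum the entries of A that share the same bin index
C = accumarray( idx', A' )';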

See a toy comparison to the naive for-loop implementation on ideone.

How does it work?

We use the second output of histc to map the elements of B (and later A) to the bins defined by the elements of ub (the unique elements of B).
accumarray is then used to sum all entries of A according to the mapping defined by idx.
Note: I assume the unique elements of B are more than 0.5 apart. Otherwise a value can land on the left edge of the next bin and be counted there.
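To make the mapping concrete, here is a small sketch on hand-checkable data. The values of A and B are made up for illustration. It also swaps histc for the third output of unique, which returns the same kind of index vector without relying on the spacing assumption. This is one possible variant, not the code from the answer above.

```matlab
% Hypothetical toy inputs, chosen so the sums can be checked by hand.
A = [10; 20; 30; 40; 50];
B = [ 3;  7;  3;  9;  7];

% u holds the sorted unique values of B; idx(k) is the position of B(k) in u.
[u, ~, idx] = unique(B);      % u = [3; 7; 9], idx = [1; 2; 1; 3; 2]

% accumarray adds up the entries of A that share the same index.
C = accumarray(idx(:), A(:)); % C = [40; 70; 40]
```

Here C(i) is the sum of A over all positions where B equals u(i), which is what the loop in the question computes, but in a single pass over the data.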

Shai answered Sep 21 '22 17:09



If B contains only positive integers, you can do it easily in one line, using the fact that sparse sums the elements that share the same index:

C = nonzeros(sparse(B,1,A));
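As a rough illustration of why this works, the sketch below runs the one-liner on a few made-up values (the inputs are hypothetical and chosen only so the result can be checked by hand).

```matlab
% Hypothetical toy inputs; B must contain positive integers to be used as row indices.
A = [10; 20; 30; 40; 50];
B = [ 3;  7;  3;  9;  7];

% Build a 9-by-1 sparse column; entries of A with the same row index in B are summed.
S = sparse(B, 1, A);

C = nonzeros(S);  % sums in order of increasing B value: [40; 70; 40]
u = find(S);      % the B values those sums belong to:   [3; 7; 9]
```

One caveat: a group whose entries of A sum to exactly zero is not stored in the sparse matrix, so nonzeros would skip it. If that can happen in your data, an accumarray-based approach may be the safer choice.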
Luis Mendo answered Sep 18 '22 17:09
