Consider two vectors, A (20000000 x 1) and B (20000000 x 1).
I need to find the sum of the elements of A corresponding to each unique element of B.
Although this looks really easy, this is taking forever in MATLAB.
Currently, I am using
u = unique(B);
length_u = length(u);
C = zeros(length_u,1);
for i = 1:length_u
    C(i,1) = sum(A(B==u(i)));
end
Is there any way to make it run faster? I tried splitting the loop and running 2 parfor loops using the Parallel Computing Toolbox (because I have only 2 cores). It still takes hours.
P.S: Yes, I should get a better computer.
You must see this answer first.
If you must, you can use a combination of histc and accumarray:
A = randi( 500, 1, 100000 );
B = randi( 500, 1, 100000 );
ub = unique( B );
[ignore idx] = histc( B, [ub-.5 ub(end)+.5] );
C = accumarray( idx', A' )';
See a toy comparison to the naive for-loop implementation on ideone.
We use the second output of histc to map elements of B (and later A) to the bins defined by the elements of ub (the unique elements of B). accumarray is then used to sum all entries of A according to the mapping defined by idx.
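Note: I assume the unique elements of B are more than 0.5 apart. If two unique values differ by exactly 0.5, the smaller one lands on a bin edge and is counted in the bin of the larger one.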
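If the spacing assumption is a concern, one possible variant (a different technique from the histc mapping above) is to take the index mapping from the third output of unique and pass it to accumarray. This is only a minimal sketch; the small vectors are made-up illustration data, not the 20000000-element inputs from the question.

```matlab
% Made-up illustration data (column vectors, as in the question)
A = [10; 20; 30; 40; 50];
B = [ 3;  1;  3;  2;  1];

% u holds the sorted unique values of B;
% idx(k) is the position of B(k) within u, so u(idx) reproduces B
[u, ~, idx] = unique(B);

% Sum the entries of A that share the same index
C = accumarray(idx(:), A(:));   % C(i) is the sum of A where B == u(i)
```

Here u is [1; 2; 3] and C is [70; 40; 40]. Because the mapping comes from unique itself, it does not depend on how far apart the values of B are.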
If B contains only positive integers (so that its values can be used as row indices), you can do it easily in one line, using the fact that sparse adds elements with the same index:
C = nonzeros(sparse(B,1,A));
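A small worked example of the sparse approach, using made-up data, may help show what happens. One caveat worth checking on your own data: nonzeros drops every zero entry, so a group whose sum happens to be exactly zero would disappear from C. The second variant below is a hedged alternative that indexes the sparse vector at the unique values of B instead.

```matlab
% Made-up illustration data; B must contain positive integers
A = [10; 20; 30; 40; 50];
B = [ 3;  1;  3;  2;  1];

% sparse sums the values that share the same (row, column) index
S = sparse(B, 1, A);        % max(B)-by-1 sparse column

% One-liner from the answer: keep only the nonzero sums
C = nonzeros(S);

% Variant that keeps groups whose sum is zero:
% read the sums at the unique values of B
C2 = full(S(unique(B)));
```

For this data both C and C2 are [70; 40; 40].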