I have a relatively large NxN matrix (N ~ 20,000) and an Nx1 vector identifying the indices that must be grouped together.
I want to sum blocks of the matrix; the blocks can in principle contain different numbers of elements, and their elements need not be adjacent. I quickly wrote a double for-loop that works correctly, but of course it is inefficient. The profiler identified these loops as one of the bottlenecks in my code.
I tried to find a smart vectorization method to solve the problem. I explored the arrayfun, cellfun, and bsxfun functions, and looked for solutions to similar problems, but I haven't found a final solution yet.
This is the test code with the two for-loops:
M = rand(10);                      % test matrix
idxM = [1 2 2 3 4 4 4 1 4 2];      % idxM(k) is the group that row/column k of M belongs to
nT = max(idxM);                    % number of groups (looping up to size(M,1) would only pad sumM with zeros)
sumM = zeros(nT, nT);
for t1 = 1:nT
    for t2 = 1:nT
        sumM(t1,t2) = sum(sum(M(idxM==t1, idxM==t2)));
    end
end
You can use accumarray as follows:
nT = size(M,1); % or nT = max(idxM)
ind = bsxfun(@plus, idxM(:), (idxM(:).'-1)*nT); % create linear indices for grouping
sumM = accumarray(ind(:), M(:), [nT^2 1]); % compute sum of each group
sumM = reshape(sumM, [nT nT]); % reshape to obtain the final result
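As a possible variant of the same idea, accumarray also accepts a two-column matrix of subscripts, which avoids building linear indices by hand. The sketch below assumes idxM contains positive integer group labels; the names nG, r, c and sumM2 are introduced here for illustration only.

```matlab
M = rand(10);                          % test matrix
idxM = [1 2 2 3 4 4 4 1 4 2];          % group label of each row/column of M
nG = max(idxM);                        % number of groups (assumed name)
[r, c] = ndgrid(idxM, idxM);           % r(i,j) = group of row i, c(i,j) = group of column j
% Sum all entries of M that share the same (row group, column group) pair
sumM2 = accumarray([r(:) c(:)], M(:), [nG nG]);
```

Like the linear-index version, this builds index arrays with N^2 elements, so memory use may become a concern for N around 20,000.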
A solution using cumsum and diff:
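[s, is] = sort(idxM);             % sort labels so that each group is contiguous
sumM = M(is, is);                 % permute rows and columns accordingly
idx = [diff(s)~=0, true];         % marks the last element of each group
CS = cumsum(sumM);                % cumulative sums down the rows
CS = cumsum(CS(idx,:), 2);        % keep group-end rows, then accumulate across columns
n = sum(idx);                     % number of distinct groups
result = diff([zeros(n,1) diff([zeros(1,n); CS(:,idx)])], 1, 2); % difference the 2-D prefix sums to get per-group sums
sumM(:) = 0;
sumM(s(idx), s(idx)) = result;    % place each sum at its (row group, column group) position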
I'd like to point those who are interested to this answer provided on another forum:
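N = numel(idxM);             % number of rows/columns of M
S = sparse(1:N, idxM, 1);    % indicator matrix: S(k,g) = 1 if index k belongs to group g
sumM = S.' * (M * S);        % M*S sums columns within each group, S.' then sums rows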
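To see why this works: S has one row per index and one column per group, so M*S adds up the columns of M that belong to the same group, and multiplying by S.' on the left does the same for the rows. A minimal check against the double loop might look like the following, where G, sumS, ref and maxErr are names chosen here for illustration.

```matlab
M = rand(10);                          % test matrix
idxM = [1 2 2 3 4 4 4 1 4 2];          % group label of each row/column of M
N = numel(idxM);
G = max(idxM);                         % number of groups (assumed name)
S = sparse(1:N, idxM, 1);              % N-by-G group indicator matrix
sumS = full(S.' * (M * S));            % G-by-G matrix of block sums

% Reference result from the original double loop
ref = zeros(G);
for g1 = 1:G
    for g2 = 1:G
        ref(g1,g2) = sum(sum(M(idxM==g1, idxM==g2)));
    end
end
maxErr = max(abs(sumS(:) - ref(:)));   % should be at round-off level
```

Unlike the index-based approaches, this only needs the N-by-G sparse matrix in addition to M itself, which is likely why it scales well to large N.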
Credits (and useful discussion):
https://www.mathworks.com/matlabcentral/answers/407634-how-to-sum-parts-of-a-matrix-of-different-sizes-without-using-for-loops