MATLAB's built-in function accumarray
accepts a function fun
as a fourth argument.
A = accumarray(subs,val,sz,fun);
This applies fun
to each subset of elements in val
that have identical subscripts in subs
. The documentation however states:
If the subscripts in
subs
are not sorted with respect to their linear indices,fun
should not depend on the order of the values in its input data.
How can we implement a stable version of accumarray
, which doesn't have this limitation, but will guarantee that the subsets adopt the same order as given by val
?
Example:
subs = [1:10,1:10];
val = 1:20;
accumarray(subs(:), val(:), [], @(x)x(end)).'
The expected output of this would be 11:20
if accumarray
were stable. In fact the output is:
ans =
11 12 13 14 5 6 7 18 19 20
Our implementation should yield:
accumarrayStable(subs(:), val(:), [], @(x)x(end)).'`
ans =
11 12 13 14 15 16 17 18 19 20
example. B = accumarray( ind , data ) sums groups of data by accumulating elements of a vector data according to the groups specified in ind . The sum is then computed over each group. The values in ind define both the group the data belongs to and the index into the output array B where each group sum is stored.
B = cumsum( A , dim ) returns the cumulative sum of the elements along dimension dim . For example, if A is a matrix, then cumsum(A,2) returns the cumulative sum of each row. B = cumsum(___, direction ) optionally specifies the direction using any of the previous syntaxes.
ind = sub2ind( sz , row , col ) returns the linear indices ind corresponding to the row and column subscripts in row and col for a matrix of size sz . Here sz is a vector with two elements, where sz(1) specifies the number of rows and sz(2) specifies the number of columns.
S = sum( A , 'all' ) computes the sum of all elements of A . This syntax is valid for MATLAB® versions R2018b and later.
We can use sortrows
as a preprocessing step to sort the indices and corresponding values first, as its documentation states:
SORTROWS
uses a stable version of quicksort.
As the subscripts in subs
should be sorted with respect to their linear indices, we need to sort them in reverse lexicographic order. This can be achieved by flipping the column ordering before and after using sortrows
.
This gives us the following code for a stable version of accumarray
:
function A = accumarrayStable(subs, val, varargin)
[subs(:,end:-1:1), I] = sortrows(subs(:,end:-1:1));
A = accumarray(subs, val(I), varargin{:});
Alternative:
As suggested by Luis Mendo, instead of sortrows
one could also generate the linear indices from the subscripts and use sort
instead.
function A = accumarrayStable(subs, val, varargin)
if numel(varargin)>0 && ~isempty(varargin{1})
sz = varargin{1};
else
sz = max(subs,[],1);
end
[~, I] = sort(subs*cumprod([1,sz(1:end-1)]).');
A = accumarray(subs(I,:), val(I), sz, varargin{:});
Note that we should be using 1+(subs-1)*cumprod([1,sz(1:end-1)]).'
for the conversion to linear indices. We leave out the +1
and -1
as the result of sort
will still be the same; which saves us a few cycles.
Which one of the above solutions is faster will depend on your machine and MATLAB version. You could test for example via:
A = randi(10, 1e4, 5);
timeit(@()accumarrayStable(A(:,1:end-1), A(:,end), [], @(x)x(1))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With