
What is the fastest way to count elements in an array?

Tags:

performance

arrays

count

matlab

binning

In my models, one of the most frequently repeated tasks is counting the number of occurrences of each element within an array. The counting is from a closed set, so I know there are X types of elements, and all or some of them populate the array, along with zeros that represent 'empty' cells. The array is not sorted in any way and can be quite long (about 1M elements), and this task is done thousands of times during one simulation (which is itself one of hundreds of simulations). The result should be a vector r of size X, so that r(k) is the number of occurrences of k in the array.

Example:

For X = 9, if I have the following input vector:

v = [0 7 8 3 0 4 4 5 3 4 4 8 3 0 6 8 5 5 0 3]

I would like to get this result:

r = [0 0 4 4 3 1 1 3 0]

Note that I don't want the count of zeros, and that elements that don't appear in the array (like 2) have a 0 in the corresponding position of the result vector (r(2) == 0).

What would be the fastest way to achieve this goal?

627

asked Aug 14 '16 11:08

EBH

1 Answer

tl;dr: The fastest method depends on the size of the array. For arrays smaller than 2¹⁴ elements, method 3 below (accumarray) is faster. For larger arrays, method 2 below (histcounts) is better.

UPDATE: I also tested this with implicit expansion (broadcasting), which was introduced in R2016b. The results are almost equal to those of the bsxfun approach, with no significant difference relative to the other methods.

Let's look at the available methods for this task. In the following examples we assume X has n elements, from 1 to n, and our array of interest is M, a column vector that can vary in size. Our result vector is spp¹, such that spp(k) is the number of ks in M. Although I write about X here, it is not implemented explicitly in the code below; I just define n = 500, and X is implicitly 1:500.

The naive `for` loop

The simplest and most straightforward way to handle this task is a for loop that iterates over the elements of X and counts the number of elements in M equal to each one:

function spp = loop(M,n)
spp = zeros(n,1);
for k = 1:size(spp,1);
    spp(k) = sum(M==k); 
end
end

This is of course not so smart, especially if only a small group of elements from X populates M, so we had better look first for those that are actually in M:

function spp = uloop(M,n)
u = unique(M); % finds which elements to count
spp = zeros(n,1);
for k = u(u>0).';
    spp(k) = sum(M==k); 
end
end

Usually, in MATLAB, it is advisable to take advantage of the built-in functions as much as possible, since most of the time they are much faster. I thought of 5 options to do so:

1. The function `tabulate`

The function tabulate returns a very convenient frequency table that at first sight seems to be the perfect solution for this task:

function tab = tabi(M)
tab = tabulate(M);
if tab(1)==0
    tab(1,:) = [];
end
end

The only fix needed is to remove the first row of the table if it counts the 0 element (there may be no zeros in M).
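
Note that the table returned by tabulate has one row per value, with the value in column 1, its count in column 2 and its percentage in column 3, so tabi returns a table rather than a length-n vector spp. A minimal sketch of the conversion, using the vector v from the question (the variable names here are my own, and tabulate requires the Statistics and Machine Learning Toolbox):

```matlab
v = [0 7 8 3 0 4 4 5 3 4 4 8 3 0 6 8 5 5 0 3].';  % example from the question, as a column
n = 9;                                             % size of the closed set X = 1:n

tab = tabulate(v);            % columns: value, count, percent
tab = tab(tab(:,1) > 0, :);   % drop the row for the 'empty' value 0, if present

spp = zeros(n,1);             % elements absent from v keep a count of 0
spp(tab(:,1)) = tab(:,2);     % place each count at the index of its value
```

Indexing by the value column keeps the result aligned with 1:n whether or not the table lists values that have a zero count.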

2. The function `histcounts`

Another option that can be tweaked quite easily to our needs is histcounts:

function spp = histci(M,n)
spp = histcounts(M,1:n+1);
end

Here, in order to count all the different elements from 1 to n separately, we define the edges to be 1:n+1, so every element in X has its own bin. We could also write histcounts(M(M>0),'BinMethod','integers'), but I already tested it, and it takes more time (though it makes the function independent of n).
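
To make the binning concrete, here is a small sketch that applies both variants to the example vector from the question. The variable names are mine, and the remark about the 'integers' variant reflects my understanding of how histcounts picks its bin range, so treat it as an assumption:

```matlab
v = [0 7 8 3 0 4 4 5 3 4 4 8 3 0 6 8 5 5 0 3];  % example from the question
n = 9;

edges = 1:n+1;              % bins [1,2), [2,3), ..., [8,9), [9,10]
r = histcounts(v, edges);   % zeros lie below the first edge and are not counted
% r is [0 0 4 4 3 1 1 3 0]

% Assumption: with 'integers' the bins span only the range of values present
% (3 to 8 here), so r_int may be shorter than n and not aligned with 1:n.
r_int = histcounts(v(v>0), 'BinMethod', 'integers');
```

With explicit edges the output always has n entries, which is what the result vector requires when some elements of X are missing from the array.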

3. The function `accumarray`

The next option is the function accumarray:

function spp = accumi(M)
spp = accumarray(M(M>0),1);
end

Here we pass M(M>0) as the first input, to skip the zeros, and use 1 as the vals input, so that every occurrence adds 1 to the count of its element.
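
One caveat: without a size argument, the output of accumarray is only as long as the largest value present in M, so it is shorter than n whenever the highest-numbered elements are absent. A minimal sketch on the example vector from the question, where 9 never appears; the names r_short and r_full are only illustrative:

```matlab
v = [0 7 8 3 0 4 4 5 3 4 4 8 3 0 6 8 5 5 0 3].';  % column vector, as accumarray expects
n = 9;

r_short = accumarray(v(v>0), 1);          % length is max(v) = 8, the entry for 9 is missing
r_full  = accumarray(v(v>0), 1, [n 1]);   % third argument fixes the output size at n-by-1
```

Passing [n 1] as the third input makes the result a length-n vector regardless of which elements are present.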

4. The function `bsxfun`

We can even use the binary operation @eq (i.e. ==) to look for all elements of each type:

function spp = bsxi(M,n)
spp = bsxfun(@eq,M,1:n);
spp = sum(spp,1);
end

If we keep the first input M and the second input 1:n in different dimensions, so that one is a column vector and the other is a row vector, then the function compares each element in M with each element in 1:n and creates a length(M)-by-n logical matrix, whose columns we can sum to get the desired result.
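
A tiny worked example may help to picture the intermediate matrix. This is only a sketch on a made-up four-element input; the last line shows the implicit-expansion form available from R2016b on, which is the variant mentioned in the update at the top:

```matlab
M = [3; 0; 3; 1];   % small column vector; 0 marks an 'empty' cell
n = 3;

L = bsxfun(@eq, M, 1:n);   % 4-by-3 logical: L(i,k) is true when M(i) == k
% L = [0 0 1
%      0 0 0
%      0 0 1
%      1 0 0]
spp = sum(L, 1);           % column sums: [1 0 2]

% From R2016b on, == expands the two vectors implicitly, with the same result:
spp_implicit = sum(M == 1:n, 1);
```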

5. The function `ndgrid`

Another option, similar to bsxfun, is to explicitly create the two matrices of all possibilities using the ndgrid function:

function spp = gridi(M,n)
[Mx,nx] = ndgrid(M,1:n);
spp = sum(Mx==nx);
end

Then we compare them and sum each column to get the final result.
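
For comparison with the bsxfun version, this sketch shows what the two grids look like for a made-up four-element input. Unlike gridi above, it passes the summing dimension explicitly, which is my own addition so that a one-element M is also handled:

```matlab
M = [3; 0; 3; 1];
n = 3;

[Mx, nx] = ndgrid(M, 1:n);   % two 4-by-3 double matrices
% Mx repeats M along the columns, nx repeats 1:n along the rows:
% Mx = [3 3 3      nx = [1 2 3
%       0 0 0            1 2 3
%       3 3 3            1 2 3
%       1 1 1]           1 2 3]
spp = sum(Mx == nx, 1);      % [1 0 2]
```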

Benchmarking

I ran a small test to find the fastest of the methods mentioned above, with n = 500 in all trials. For some methods (especially the naive for loop), n has a great impact on the execution time, but that is not the issue here since we want to test for a given n.

Here are the results: [Figure "Timing hist": https://i.stack.imgur.com/AS61T.png]

We can notice several things:

1. Interestingly, there is a shift in the fastest method. For arrays smaller than 2¹⁴, accumarray is the fastest. For arrays larger than 2¹⁴, histcounts is the fastest.
2. As expected, the naive for loops, in both versions, are the slowest, and for arrays smaller than 2⁸ the "unique & for" option is slower than the plain loop. ndgrid becomes the slowest for arrays bigger than 2¹¹, probably because of the need to store very large matrices in memory.
3. There is some irregularity in the way tabulate works on arrays smaller than 2⁹. This result was consistent (with some variation in the pattern) in all the trials I conducted.

(The bsxfun and ndgrid curves are truncated because larger arrays make my computer freeze, and the trend is already quite clear.)
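
A rough back-of-the-envelope estimate shows why these two methods run out of memory. The sketch below assumes M is stored as double (as randi returns it), 1 byte per logical element and 8 bytes per double, and it ignores any temporary copies MATLAB may make:

```matlab
N = 2^19;   % array length at the upper end of the range recommended in the script
n = 500;    % number of element types used in the benchmark

bytes_bsxfun = N * n * 1;            % one logical matrix, 1 byte per element
bytes_ndgrid = 2 * N * n * 8 ...     % Mx and nx, 8 bytes per double
             + N * n * 1;            % plus the logical result of Mx == nx

fprintf('bsxfun: %.2f GB\n', bytes_bsxfun / 1e9);   % about 0.26 GB
fprintf('ndgrid: %.2f GB\n', bytes_ndgrid / 1e9);   % about 4.46 GB
```

Both intermediate results grow with the product of the array length and n, while histcounts and accumarray only need the length-n output.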

Also, notice that the y-axis is in log₁₀, so a decrease of one unit (as between accumarray and histcounts for arrays of size 2¹⁹) means a 10-times faster operation.

I'll be glad to hear suggestions for improving this test in the comments, and if you have another, conceptually different method, you are most welcome to suggest it as an answer.

The code

Here are all the functions wrapped in a timing function:

function out = timing_hist(N,n)
M = randi([0 n],N,1);
func_times = {'for','unique & for','tabulate','histcounts','accumarray','bsxfun','ndgrid';
    timeit(@() loop(M,n)),...
    timeit(@() uloop(M,n)),...
    timeit(@() tabi(M)),...
    timeit(@() histci(M,n)),...
    timeit(@() accumi(M)),...
    timeit(@() bsxi(M,n)),...
    timeit(@() gridi(M,n))};
out = cell2mat(func_times(2,:));
end

function spp = loop(M,n)
spp = zeros(n,1);
for k = 1:size(spp,1);
    spp(k) = sum(M==k); 
end
end

function spp = uloop(M,n)
u = unique(M);
spp = zeros(n,1);
for k = u(u>0).';
    spp(k) = sum(M==k); 
end
end

function tab = tabi(M)
tab = tabulate(M);
if tab(1)==0
    tab(1,:) = [];
end
end

function spp = histci(M,n)
spp = histcounts(M,1:n+1);
end

function spp = accumi(M)
spp = accumarray(M(M>0),1);
end

function spp = bsxi(M,n)
spp = bsxfun(@eq,M,1:n);
spp = sum(spp,1);
end

function spp = gridi(M,n)
[Mx,nx] = ndgrid(M,1:n);
spp = sum(Mx==nx);
end

And here is the script to run this code and produce the graph:

N = 25; % running with N>19 is not recommended for the `bsxfun` and `ndgrid` functions.
func_times = zeros(N,7); % one column per method timed in timing_hist
for n = 1:N
    func_times(n,:) = timing_hist(2^n,500);
end
% plotting:
hold on
mark = 'xo*^dsp';
for k = 1:size(func_times,2)
    plot(1:size(func_times,1),log10(func_times(:,k).*1000),['-' mark(k)],...
        'MarkerEdgeColor','k','LineWidth',1.5);
end
hold off
xlabel('Log_2(Array size)','FontSize',16)
ylabel('Log_{10}(Execution time) (ms)','FontSize',16)
legend({'for','unique & for','tabulate','histcounts','accumarray','bsxfun','ndgrid'},...
    'Location','NorthWest','FontSize',14)
grid on
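
Before comparing timings, it can be worth confirming that the methods return the same counts. The following is only a sketch of such a check, written inline rather than with the functions above; the accumarray call is given an explicit output size (my addition) so that its length is n even if the largest values happen to be absent:

```matlab
n = 500;
M = randi([0 n], 2^10, 1);          % same kind of input as in timing_hist

ref = zeros(n,1);                   % reference result from the naive loop
for k = 1:n
    ref(k) = sum(M == k);
end

r_hist  = histcounts(M, 1:n+1).';           % row output, transposed to a column
r_accum = accumarray(M(M>0), 1, [n 1]);     % size argument pads missing trailing values
r_bsx   = sum(bsxfun(@eq, M, 1:n), 1).';    % row output, transposed to a column

assert(isequal(ref, r_hist, r_accum, r_bsx));
```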

¹ The reason for this weird name comes from my field, ecology. My models are cellular automata that typically simulate individual organisms in a virtual space (the M above). The individuals are of different species (hence spp) and together form what is called an "ecological community". The "state" of the community is given by the number of individuals from each species, which is the spp vector in this answer. In these models, we first define a species pool (X above) for the individuals to be drawn from, and the community state takes into account all species in the species pool, not only those present in M.

138

answered Oct 18 '22 12:10

EBH
