A while back I provided an answer to this question.
Objective: count the number of values in this matrix that are in the [3 6]
range:
A = [2 3 4 5 6 7;
7 6 5 4 3 2]
I came up with 12 different ways to do it:
count = numel(A( A(:)>3 & A(:)<6 )) %# (1)
count = length(A( A(:)>3 & A(:)<6 )) %# (2)
count = nnz( A(:)>3 & A(:)<6 ) %# (3)
count = sum( A(:)>3 & A(:)<6 ) %# (4)
Ac = A(:);
count = numel(A( Ac>3 & Ac<6 )) %# (5,6,7,8)
%# prevents double expansion
%# similar for length(), nnz(), sum(),
%# in the same order as (1)-(4)
count = numel(A( abs(A-(6+3)/2)<3/2 )) %# (9,10,11,12)
%# prevents double comparison and &
%# similar for length(), nnz(), sum()
%# in the same order as (1)-(4)
So, I decided to find out which is fastest. Test code:
A = randi(10, 50);
tic
for ii = 1:1e5
%# method is inserted here
end
toc
results (best of 5 runs, all in seconds):
%# ( 1): 2.981446
%# ( 2): 3.006602
%# ( 3): 3.077083
%# ( 4): 2.619057
%# ( 5): 3.011029
%# ( 6): 2.868021
%# ( 7): 3.149641
%# ( 8): 2.457988
%# ( 9): 1.675575
%# (10): 1.675384
%# (11): 2.442607
%# (12): 1.222510
So it seems that count = sum(( abs(A(:)-(6+3)/2) < (3/2) ));
is the fastest way to go here...
I trade one <
with two divisions, an addition and an abs
, and the execution time is less than half! Does anyone have an explanation for why this is?
The JIT compiler probably replaces the divisions/additions with a single value in memory, but there's still the abs
...Branch misprediction then? Seems silly for something as simple as this...
The A(:)>3 & A(:)<6
expression needs to evaluate two conditions, whereas the abs(A(:)-(6+3)/2) < 3/2)
evaluates one only one.
For very tight compute intensive loops this makes a lot of difference. Even without branch mispredictions, branching in itself is relatively costly. That's why, for instance, loop unrolling works as an optimization technique.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With