Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Matlab performance: comparison slower than arithmetic

A while back I provided an answer to this question.

Objective: count the number of values in this matrix that are in the [3 6] range:

A = [2 3 4 5 6 7;
     7 6 5 4 3 2]

I came up with 12 different ways to do it:

count = numel(A( A(:)>3 & A(:)<6 ))      %# (1)
count = length(A( A(:)>3 & A(:)<6 ))     %# (2)
count = nnz( A(:)>3 & A(:)<6 )           %# (3)
count = sum( A(:)>3 & A(:)<6 )           %# (4)

Ac = A(:);
count = numel(A( Ac>3 & Ac<6 ))          %# (5,6,7,8)
%# prevents double expansion
%# similar for length(), nnz(), sum(),
%# in the same order as (1)-(4)

count = numel(A( abs(A-(6+3)/2)<3/2 ))   %# (9,10,11,12)
%# prevents double comparison and & 
%# similar for length(), nnz(), sum()
%# in the same order as (1)-(4)

So, I decided to find out which is fastest. Test code:

A = randi(10, 50);
tic
for ii = 1:1e5

    %# method is inserted here

end
toc

results (best of 5 runs, all in seconds):

%# ( 1): 2.981446
%# ( 2): 3.006602
%# ( 3): 3.077083
%# ( 4): 2.619057
%# ( 5): 3.011029
%# ( 6): 2.868021
%# ( 7): 3.149641
%# ( 8): 2.457988
%# ( 9): 1.675575
%# (10): 1.675384
%# (11): 2.442607
%# (12): 1.222510

So it seems that count = sum(( abs(A(:)-(6+3)/2) < (3/2) )); is the fastest way to go here...

I trade one < with two divisions, an addition and an abs, and the execution time is less than half! Does anyone have an explanation for why this is?

The JIT compiler probably replaces the divisions/additions with a single value in memory, but there's still the abs...Branch misprediction then? Seems silly for something as simple as this...

like image 293
Rody Oldenhuis Avatar asked Aug 27 '12 06:08

Rody Oldenhuis


1 Answers

The A(:)>3 & A(:)<6 expression needs to evaluate two conditions, whereas the abs(A(:)-(6+3)/2) < 3/2) evaluates one only one.

For very tight compute intensive loops this makes a lot of difference. Even without branch mispredictions, branching in itself is relatively costly. That's why, for instance, loop unrolling works as an optimization technique.

like image 115
IvoTops Avatar answered Nov 05 '22 16:11

IvoTops