
Why put the most likely condition first in an `if-else` statement?

I've heard this often enough to actually question it: many people say that in an if-else statement, one should put the condition that is most likely to be true first. So, if condition is likely to be false most of the time, put !condition in the if statement; otherwise use condition. A contrived illustration of what I mean:

if (likely) {
    // do this
} else {
    // do that
}

if (!unlikely) {
    // do this
} else {
    // do that
}

Some people say that this is more efficient, perhaps due to branch prediction or some other optimization (I've never actually enquired when the topic has been broached), but as far as I can tell there will always be one test, and both paths will result in a jump.

So my question is - is there a convincing reason (where a "convincing reason" may be a tiny efficiency gain) why the condition that is most likely to be true should come first in an if-else statement?

Sean Kelleher asked Sep 29 '22 08:09


1 Answer

There are two reasons why order may matter:

  1. The branches of an if / else if / else chain have different probabilities because of a known distribution of the data.

Example: sorting apples. Most are good yellow ones (90%), but some are orange (5%) and some are other colors (5%).

 if (apple.isYellow) {...}
 else if (apple.isOrange) {....}
 else {...}   

vs.

 if (!apple.isYellow && !apple.isOrange) {...}   
 else if (apple.isOrange ) {....}
 else  {...}

In the first sample, 90% of apples are handled with just one check and 10% need two; in the second sample, only 5% are handled with one check and 95% need two.

So if you know there is a significant difference in how often each branch is taken, it may be useful to move the most likely one up to be the first condition.

Note that for a single if/else, as in your sample, this makes no difference at that level: exactly one condition is tested either way.
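To put numbers on the apple example, here is a small sketch in Python (not part of the original snippets). The helper name `expected_checks` is made up for illustration, and it assumes each `if` / `else if` condition counts as one check, even a compound one like `!apple.isYellow && !apple.isOrange`:

```python
def expected_checks(branch_probabilities):
    """Average number of conditions evaluated in an if / else-if / else chain.

    branch_probabilities lists, in source order, the chance that each branch
    is the one taken. The last entry is the final `else`, which has no
    condition of its own. (Assumption: every condition costs one check.)
    """
    n_conditions = len(branch_probabilities) - 1
    total = 0.0
    for position, p in enumerate(branch_probabilities, start=1):
        # Branch k is reached after k checks; the trailing else costs the
        # same as the last tested branch because it tests nothing new.
        checks = min(position, n_conditions)
        total += p * checks
    return total


yellow_first = expected_checks([0.90, 0.05, 0.05])  # yellow, orange, other
other_first = expected_checks([0.05, 0.05, 0.90])   # other, orange, yellow
single_if_else = expected_checks([0.90, 0.10])      # plain if/else

print(yellow_first, other_first, single_if_else)
```

With these probabilities the yellow-first ordering averages roughly 1.1 checks per apple and the other ordering roughly 1.95, while a plain if/else always costs exactly one check regardless of which side is more likely.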

  2. Low-level CPU optimizations that may favor one of the branches (though this is more about the incoming data consistently hitting the same branch of the condition).

Earlier or simpler CPUs may need to flush the instruction pipeline when a conditional jump is taken, compared to the case where the code executes sequentially. This may be the reason for such a suggestion.

Sample (fake assembly):

    IF R1 > R2 JUMP ElseBranch
    ADD R2, R3 -> R4  // CPU decodes this instruction WHILE executing the previous IF
    ....
    JUMP EndCondition
ElseBranch:
    ADD 4, R3 -> R4   // CPU can decode this instruction only AFTER the jump is taken,
                      // and has to discard the already-decoded ADD R2, R3 -> R4
    ....
EndCondition: 
    ....

Modern CPUs should not have this problem, as they can decode instructions for both branches. They also have branch prediction logic for conditions: if a condition mostly resolves one way, the CPU assumes it will resolve that way again and starts executing the code in that branch before the check is finished. To my knowledge, it does not matter on current CPUs whether that is the first or the alternative branch of the condition. See "Why is it faster to process a sorted array than an unsorted array?" for good info on that.
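As a rough illustration of why consistency matters more than which branch comes first, here is a toy simulation in Python of a textbook 2-bit saturating-counter predictor. This is a teaching model only; real CPUs use far more elaborate predictors, and the function name `mispredictions` and the test patterns are invented for this sketch:

```python
def mispredictions(outcomes):
    """Count misses of a 2-bit saturating-counter predictor for one branch.

    outcomes is a sequence of booleans: True means the branch was taken.
    """
    counter = 2  # states 0-1 predict "not taken", states 2-3 predict "taken"
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


mostly_taken = [i % 10 != 0 for i in range(1000)]   # taken 90% of the time
mostly_not_taken = [not t for t in mostly_taken]    # taken 10% of the time
alternating = [i % 2 == 0 for i in range(1000)]     # flips on every execution

for name, pattern in [("mostly taken", mostly_taken),
                      ("mostly not taken", mostly_not_taken),
                      ("alternating", alternating)]:
    print(name, mispredictions(pattern))
```

In this model the two lopsided patterns miss about equally often (around one in ten), whichever direction is the common one, while the alternating pattern misses about half the time. That matches the point above: what helps the predictor is data that keeps hitting the same branch, not the order in which the branches are written.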

Alexei Levenkov answered Oct 18 '22 00:10