I have the following situation
bool user_set_flag;
getFlagFromUser(&user_set_flag);
while(1){
if(user_set_flag){
//do some computation and output
}
//do other computation
}
The variable user_set_flag
is only set once and only once in the code, at the very start, its essentially the user selecting what he wants to do with the program. Say that the user selects user_set_flag = false
then will the compiler compile the code in such a way such that the if(user_set_flag)
statement will only be checked once, or will it be always checked. Can I give the compiler hints like setting the bool to a const?
The reason I ask this is because my application is time-critical and it processes frames as fast as possible. A branch that is always false should be able to be determined at run-time somehow?
Firstly, processors have a capability called branch prediction. After a few runs of the loop, the processor will be able to notice that your if
statement always goes one way. (It can even notice regular patterns, like true false true false
.) It will then speculatively execute that branch, and so long as it able to predict correctly, the extra cost of the if
statement is pretty much eliminated. If you think that the user is more likely to choose true
rather than false
, you can even tell this to the gcc compiler (gcc-specific extension).
However, you did mention in one of your comments that you have a 'much more complicated sequence of bools'. I think it is possible that the processor doesn't have the memory to pattern-match all those jumps -- by the time it comes back to the first if
statement, the knowledge of which way that jump went has been displaced from its memory. But we could help it here...
The compiler has the ability to transform loops and if-statements into what it thinks are more optimal forms. E.g. it could possibly transform your code into the form given by schnaader. This is known as loop unswitching. You can help it along by doing Profile-Guided Optimization (PGO), letting the compiler know where the hotspots are. (Note: In gcc, -funswitch-loops
is only turned on at -O3
.)
You should profile your code at the instruction level (VTune would be a good tool for this) to see if the if-statements are really the bottleneck. If they really are, and if by looking at the generated assembly you think the compiler has got it wrong despite PGO, you can try hoisting out the if-statement yourself. Perhaps templated code would make it more convenient:
template<bool B> void innerLoop() {
for (int i=0; i<10000; i++) {
if (B) {
// some stuff..
} else {
// some other stuff..
}
}
}
if (user_set_flag) innerLoop<true>();
else innerLoop<false>();
I don't think it's at all possible to optimize this any further. The compiler is smart enough to know that the value of user_set_flag
is not going to change during the execution of the loop and will generate the most efficient machine code for that.
This is also somewhat in the realm of second-guessing the compiler. Unless you really really really know what you are doing its better to stick with the simplest solution.
As an exercise try to profile (time) the execution using both if (true)
and if(user_set_flag)
. My guess would be that there is going to be zero difference in execution time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With