I am writing a critical piece of code with roughly the following logic:
if (expression) {
    // do something with extremely low latency before the nuke blows up.
    // This branch is entered rarely, but it is the most important case.
} else {
    // do an unimportant thing that doesn't really matter
}
I am thinking of using the likely() macro around the expression, so that when execution hits the important branch, I get minimum latency.
My concern is that this usage is really the opposite of what the macro's name suggests, because I am picking the unlikely branch as the one to be predicted and prefetched, i.e., the important branch is unlikely to happen, but it is the most critical thing when it does.
Is there a clear downside of doing this in terms of performance?
Yes. You are tricking the compiler by tagging the unlikely-but-must-be-fast branch as if it were the likely branch, in hopes that the compiler will make it faster.
There is a clear downside to doing that: if you don't write a good comment that explains what you're doing and why, some maintainer (possibly you yourself) in six months is almost guaranteed to say, "Hey, looks like he put the likely on the wrong branch," and "fix" it.
There is also a much less likely but still possible downside: some version of some compiler that you use now or in the future will do something different from what you expect with the likely macro, that something will not be what you wanted to trick the compiler into doing, and you'll end up with code that, every time through the loop, spends $100K speculatively getting 90% of the way through reactor shutdown before undoing it.
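If you do keep the trick, a comment along these lines heads off the accidental "fix"; this is only a sketch in my own wording, with emergency_detected() and the handlers as placeholder names:
#define likely(x) __builtin_expect(!!(x), 1)

static int  emergency_detected(void) { return 0; }  /* placeholder: rarely true */
static void handle_emergency(void)   { }            /* placeholder: latency-critical work */
static void do_routine_work(void)    { }            /* placeholder: unimportant work */

void poll_once(void)
{
    /* NOTE: likely() is deliberately on the RARE branch. We are trading
     * average-case throughput for minimum latency on the critical path.
     * Do NOT "fix" this by swapping it to unlikely(). */
    if (likely(emergency_detected()))
        handle_emergency();
    else
        do_routine_work();
}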
It's absolutely the opposite of the traditional use of __builtin_expect(x, 1), which is reflected in the usual definition of the macro:
#define likely(x) __builtin_expect(!!(x), 1)  /* !! normalizes x to 0 or 1 */
which I would personally consider to be bad form here, since you're cryptically marking the unlikely path as likely for a performance gain. However, you can still express this optimization honestly: __builtin_expect(x, 1) itself makes no assumptions about your intent; calling a path "likely" is just its conventional use. To do what you want, I'd suggest:
#define optimize_path(x) __builtin_expect(!!(x), 1)
which will do the same thing, but rather than making the code accuse the unlikely path of being likely, you're now making the code describe what you're really attempting -- to optimize the critical path.
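For example, the renamed macro might be used like this; this is a minimal sketch, and THRESHOLD, trigger_shutdown(), and log_reading() are hypothetical names invented for illustration:
#include <stdio.h>

#define optimize_path(x) __builtin_expect(!!(x), 1)  /* as defined above */
#define THRESHOLD 9000                               /* hypothetical trip point */

static void trigger_shutdown(void) { puts("shutdown"); }          /* stub critical handler */
static void log_reading(int r)     { printf("reading %d\n", r); } /* stub routine handler */

void control_step(int sensor_reading)
{
    /* The rare but latency-critical branch is laid out as the predicted path. */
    if (optimize_path(sensor_reading > THRESHOLD))
        trigger_shutdown();
    else
        log_reading(sensor_reading);
}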
However, I should say that if you're planning on timing a nuke, you should not only be hand-checking (and timing) the compiled assembly so that the timing is correct, but you should also be using an RTOS.

A branch misprediction will have an extraordinarily insignificant effect, to the point that it's almost irrelevant here, since you can compensate for the one-in-a-million event by simply using a faster processor or correctly budgeting for the mispredict delay. What does affect modern computer timings is OS preemption and scheduling: if you need something to happen on a very precise timescale, you should be scheduling it for real time, not the pseudo-real time that most general-purpose operating systems provide. Branch misprediction delay is generally hundreds of times smaller than the delay that can occur from not using an RTOS in a real-time situation.

Typically, if you believe branch misprediction might be a problem, you remove the branch from the time-sensitive path altogether (one way to do that is sketched below), since the branch predictor's state is complex and out of your control. Macros like likely and unlikely are for blocks of code that can be reached from various places, with various branch-prediction states, and, most importantly, that are executed very frequently. The high frequency of hitting these branches leads to a tangible performance increase for applications that use them (like the Linux kernel). If you only hit the branch once, you might get a one-nanosecond boost in some cases, but if an application is ever that time-critical, there are other things you can do for much larger gains in performance.
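As an illustration of removing the branch from the time-sensitive path, one common pattern is to compute a mask and select between the two outcomes with pure arithmetic, which compilers typically lower to a conditional move whose latency doesn't depend on predictor state. A minimal sketch, with select_u32 being a name I'm inventing here:
#include <stdint.h>

/* Branchless select: returns a when cond is nonzero, else b.
 * -(cond != 0) evaluates to all-ones or all-zeros, so the result
 * is chosen by bitwise arithmetic with no conditional jump. */
static inline uint32_t select_u32(int cond, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)-(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Usage: setpoint = select_u32(emergency_flag, SHUTDOWN_CODE, NORMAL_CODE); */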