I'm reading around that branch misprediction can be a hot bottleneck for the performance of an application. As I can see, people often show assembly code that unveil the problem and state that programmers usually can predict where a branch could go the most of the times and avoid branch mispredictons.
My questions are:
Is it possible to avoid branch mispredictions using some high level programming technique (i.e. no assembly)?
What should I keep in mind to produce branch-friendly code in a high level programming language (I'm mostly interested in C and C++)?
Code examples and benchmarks are welcome.
Branching is the practice of creating copies of programs or objects in development to work in parallel versions, retaining the original and working on the branch or making different changes to each.
Branchless programming is a programming technique that eliminates the branches (if, switch, and other conditional statements) from the program. Although this is not much relevant these days with extremely powerful systems and usage of interpreted languages( especially dynamic typed ones).
Branching statements allow the flow of execution to jump to a different part of the program. The common branching statements used within other control structures include: break , continue , return , and goto .
When an "Algorithm" makes a choice to do one of two (or more things) this is called branching. The most common programming "statement" used to branch is the "IF" statement.
people often ... and state that programmers usually can predict where a branch could go
(*) Experienced programmers often remind that human programmers are very bad at predicting that.
1- Is it possible to avoid branch mispredictions using some high level programming technique (i.e. no assembly)?
Not in standard c++ or c. At least not for a single branch. What you can do is minimize the depth of your dependency chains so that branch mis-prediction would not have any effect. Modern cpus will execute both code paths of a branch and drop the one that wasn't chosen. There's a limit to this however, which is why branch prediction only matters in deep dependency chains.
Some compilers provide extension for suggesting the prediction manually such as __builtin_expect in gcc. Here is a stackoverflow question about it. Even better, some compilers (such as gcc) support profiling the code and automatically detect the optimal predictions. It's smart to use profiling rather than manual work because of (*).
2- What should I keep in mind to produce branch-friendly code in a high level programming language (I'm mostly interested in C and C++)?
Primarily, you should keep in mind that branch mis-prediction is only going to affect you in the most performance critical part of your program and not to worry about it until you've measured and found a problem.
But what can I do when some profiler (valgrind, VTune, ...) tells that on line n of foo.cpp I got a branch prediction penalty?
Lundin gave very sensible advice
Order of 2. and 3. may be switched. Optimizing your code by hand is a lot of work. On the other hand, gathering the profiling data can be difficult for some programs as well.
(**) One way to do that is transform your loops by for example unrolling them. You can also let the optimizer do it automatically. You must measure though, because unrolling will affect the way you interact with the cache and may well end up being a pessimization.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With