Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to tell the branch predictor how likely it is to follow the branch?

Just to make it clear, I'm not going for any sort of portability here, so any solutions that will tie me to a certain box is fine.

Basically, I have an if statement that will 99% of the time evaluate to true, and am trying to eke out every last clock of performance, can I issue some sort of compiler command (using GCC 4.1.2 and the x86 ISA, if it matters) to tell the branch predictor that it should cache for that branch?

like image 559
Andy Shulman Avatar asked Dec 05 '09 06:12

Andy Shulman


People also ask

How accurate are branch predictors?

On the SPEC'89 benchmarks, very large bimodal predictors saturate at 93.5% correct, once every branch maps to a unique counter. The predictor table is indexed with the instruction address bits, so that the processor can fetch a prediction for every instruction before the instruction is decoded.

What will happen if the branch predictor will predict the wrong target?

This says whether the branch was recently taken or not. Based on this, the processor fetches the next instruction from the target address / sequential address. If the prediction is wrong, flush the pipeline and also flip prediction. So, every time a wrong prediction is made, the prediction bit is flipped.

How accurate are today's cpus at branch prediction?

The idea is very simple, but the effect is surprisingly good. From the data of Wikipedia, we can know that the branch prediction accuracy of modern CPU can reach more than 90%.


1 Answers

Yes, but it will have no effect. Exceptions are older (obsolete) architectures pre Netburst, and even then it doesn't do anything measurable.

There is an "branch hint" opcode Intel introduced with the Netburst architecture, and a default static branch prediction for cold jumps (backward predicted taken, forward predicted non taken) on some older architectures. GCC implements this with the __builtin_expect (x, prediction), where prediction is typically 0 or 1. The opcode emitted by the compiler is ignored on all newer processor architecures (>= Core 2). The small corner case where this actually does something is the case of a cold jump on the old Netburst architecture. Intel recommends now not to use the static branch hints, probably because they consider the increase of the code size more harmful than the possible marginal speed up.

Besides the useless branch hint for the predictor, __builtin_expect has its use, the compiler may reorder the code to improve cache usage or save memory.

There are multiple reasons it doesn't work as expected.

  • The processor can predict small loops (n<64) perfectly.
  • The processor can predict small repeating patterns (n~7) perfectly.
  • The processor itself can estimate the probability of a branch during runtime better than the compiler/programmer during compile time.
  • The predictability (= probability a branch will get predicted correctly) of a branch is far more important than the probability that the branch is taken. Unfortunately, this is highly architecture-dependent, and predicting the predictability of branch is notoriously hard.

Read more about the inner works of the branch prediction at Agner Fogs manuals. See also the gcc mailing list.

like image 197
Gunther Piez Avatar answered Sep 28 '22 05:09

Gunther Piez