Let's say you make some code that has an if statement and condition in that if statement always ends up being true for the entire run of your program, but that it can't be known at compile time that the condition is always true (maybe its specified on the command line)
In this case, you'd expect the cpu to be able to predict this and always choose the right path to take.
Does this mean that with branch prediction that the code is as fast as if there isn't an if statement there at all?
Or, are there real costs still?
I can imagine some costs that probably don't go away...
Are those true? Are there any others?
Some architectures can make correctly predicted branches more or less free. It's called branch folding. Basically what happens is that when the branch predictor in the front end sees a branch that it's encountered before and whose target is known, instead of sending the branch down the pipeline, it can send the instruction at the branch target instead. The front-end still has to see the branch, but the back-end never does, so if the back-end is even slightly behind, the bubbles balance out, and it's as if the branch never existed.
That isn't the whole story, though. A branch is likely as not to be mispredicted the first few times, and if it's a conditional branch, somehow the condition still has to be resolved.
You will have to execute the branch (after resolving all dependencies first), which would take execution resources (ports, queue entries, etc), although any alternative approach (such as conditional moves) would also require an equivalent effort at least. Even predictors must eventually check that they're right.
In addition, most out-of-order CPUs also employ dedicated queues to track branches, see for example the branch order buffer in Haswell - http://www.realworldtech.com/haswell-cpu/3/ . This could fill up and become a limitation if you have too many branches. Some micro-architectures may also impose other restrictions on the bandwidth of updating the branch predictor, ultimately limiting the rate at which you can process these braches (either by blocking execution or commit).
Regarding the code cacheability - yes, naturally you'd have some code for both paths, but again - if the branch has some functional purpose in the program, you can't really help but having this code. The branch may change the way the code is organized (depending on compiler optimizations), but that seems like a secondary effect. If the entire if
condition is redundant, then the same applies for any code bloating you may have in your code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With