Most, if not all, modern processors use a technique called "branch prediction", with which they guess which way to go at an if-then-else branch.

I have a question concerning this scheme. Say we have this piece of code, in no specific language:
if (someCondition)
{
    // some action
    return someValue;
}
// some other action
return someOtherValue;
Logically speaking, that code is equivalent to this code:
if (someCondition)
{
    // some action
    return someValue;
}
else
{
    // some other action
    return someOtherValue;
}
The branch predictor would 'predict' the branch in the second example, but what about the first? Will it guess? What will be loaded into the pipeline? Is there any speed to be gained with either example, disregarding the impact of the actual code in the blocks?
My guess is that it depends on the compiler: if statements are implemented (in assembly) using conditional jumps, which are taken only when the compare flag in the register is set. Exactly what the assembly instructions look like depends on the compiler. Unless there is a common way of handling it that every compiler uses, which I doubt there is, this is compiler-dependent. In that case, what would happen with the latest Visual C++ and g++ compilers?
As hexafraction pointed out, depending on the relation between the return values, as well as on how someCondition is determined, the branch predictor might not kick in at all. Let us consider only true and false as return values. For the condition, let us consider both a field that has been predetermined (either inside or outside the function), a local variable, and some arithmetic expression.

To be honest, I do not suspect there is much difference between the case where the condition is a local variable and the case where the field has been predetermined in the same function.
In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g., an if–then–else structure) will go before this is known definitively. The purpose of the branch predictor is to improve the flow in the instruction pipeline.
But the dominant CPU designs today rely on dynamic branch prediction. This technique is able to mostly avoid the frontend bubble problem, by predicting the correct address of the next instruction even for branches that aren't fully decoded and executed yet.
It predicts where to fetch from next cycle, and then based on that prediction it predicts where to fetch from the cycle after that, and so on. It doesn't use any information about the decoded instructions, but only past behavior of the branches.
If it guessed wrong, it has to undo anything the speculatively executed instructions might have done, clear the pipeline, and begin sending down the instructions it was supposed to be executing. This results in the same delay as if it hadn't made a prediction at all.
Most likely, gcc -O3 would optimise that to a branchless sequence using a conditional-move instruction, e.g. on x86:
# generate someValue in %rax, the x86-64 ABI's return value register
# generate someOtherValue in %rdi, to pick one at random
test someCondition # probably actually test or cmp a register
cmovz %rdi, %rax # copy %rdi to %rax, if the zero flag is set.
ret
cmov has a data dependency on both its inputs and on flags. A conditional branch is a control dependency. Using cmov is often good, unless it's part of one long dependency chain and the branch is fairly predictable.
If there were more work inside the if blocks, gcc would generate a conditional jump instruction instead:
# generate someValue in %rax
test someCondition
jz .zero
ret
.zero:
# compute someOtherValue. This work doesn't need to happen at all
# if we don't end up needing it, unlike in the cmov case
mov someOtherValue, %rax
ret
Branch prediction operates on conditional-jump instructions, not on high-level constructs. The same instructions are used to jump back to the top of a loop if the loop condition is true. According to http://agner.org/optimize/ , recent Intel CPUs remember patterns of up to 64 iterations for loops. So loops that run the same number of iterations every time don't have a branch mispredict on the last iteration, if the number of iterations is 64 or less.
So it's not the sequence of instructions that the branch predictor looks at to guess whether a jump will be taken or not taken. Every separate branch instruction gets an entry in the branch-history buffer when it's taken. And yes, every compiler has no choice but to use jcc (Jump on Condition Code) instructions to implement branches and loops.
The default is predict not-taken. If that prediction is correct, the CPU doesn't evict potentially-still-useful info from the cache to make room. See Agner Fog's microarch doc for more low-level details.
On Linux, you can see the branch predictor in action with perf stat:
perf stat /bin/ls # in some big directory
... normal ls output

 Performance counter stats for '/bin/ls':

         10.403069      task-clock (msec)         #    0.094 CPUs utilized
             2,255      context-switches          #    0.217 M/sec
                 0      cpu-migrations            #    0.000 K/sec
               190      page-faults               #    0.018 M/sec
        16,612,260      cycles                    #    1.597 GHz
         7,843,399      stalled-cycles-frontend   #   47.21% frontend cycles idle
         5,205,565      stalled-cycles-backend    #   31.34% backend cycles idle
        20,227,093      instructions              #    1.22  insns per cycle
                                                  #    0.39  stalled cycles per insn
########### These two lines ######
         3,975,777      branches                  #  382.173 M/sec
            55,785      branch-misses             #    1.40% of all branches

       0.110765717 seconds time elapsed
That's on an Intel Sandybridge (i5 2500k) at its low-power clock speed; with the default cpufreq governor, it doesn't ramp up the clock speed before ls is done.
There is no difference between those two code samples. The else is irrelevant because there is no need to branch at the end of the true clause. Even if there were, that branch would be unconditional.
In other words, the code must compile to something like:
    Compute test expression
    Branch if false to false_label
    True action
    Return some value
false_label:
    False action
    Return some other value