Alright, so I know that if a particular conditional branch has a condition that takes time to compute (memory access, for instance), the CPU assumes a condition result and speculatively executes along that path. However, what would happen if, along that path, yet another slow conditional branch pops up (assuming, of course, that the first condition hasn't been resolved yet and the CPU can't just commit the changes)? Does the CPU just speculate inside the speculation? What happens if the last condition is mispredicted but the first wasn't? Does it just rollback all the way?
I'm talking about something like this:
if (value_in_memory == y){
// computations
if (another_val_memory == x){
//computations
}
}
The reason it’s “speculative” execution is that the CPU might be wrong. If it is, the system loads the appropriate data and executes those instructions instead. But branch predictors aren’t wrong very often; accuracy rates are typically above 95 percent.
Overview. Modern pipelined microprocessors use speculative execution to reduce the cost of conditional branch instructions using schemes that predict the execution path of a program based on the history of branch executions. In order to improve performance and utilization of computer resources, instructions can be scheduled at a time...
The reason its “speculative” execution, of course, is because the CPU might be wrong. If it is, the system loads the appropriate data and executes those instructions instead. But branch predictors aren’t wrong very often; accuracy rates are typically above 95 percent.
Speculative multithreading is a special case of speculative execution. Modern pipelined microprocessors use speculative execution to reduce the cost of conditional branch instructions using schemes that predict the execution path of a program based on the history of branch executions.
Speculative execution is the regular state of execution, not a special mode that an out of order CPU enters when it sees a branch and then leaves when the branch is no longer in flight.
This is easier to see if you consider that it's not just branches that can fault, but many instructions, including those that access memory, have restrictions on their input values, etc. So any substantial out of order execution implies constant speculation, and CPUs are built around that idea.
So "nested branches" doesn't end up being special in that sense.
Now, modern CPUs have a variety of methods for quick branch misprediction recovery, faster than recovery from other types of faults1. For example they may snapshot the state of the register mapping at some branches, to allow recovery to start before the branch is at the head of the reorder buffer. Since it is not always feasible to snapshot at all branches, there might be complicated heuristics involved to decide where to take snapshots.
I mention this last part because it is one way in which nested branches might matter: when there are lots of branches in flight, you might hit some microarchitectural limits related to the tracking of these branches for recovery purposes. For more details, you can look through patents for "branch order buffer" (for Intel techniques, but there are no doubt others).
1 The basic recovery method is keep executing until the faulting instruction is the next to retire, and then throw away all younger instructions. In the context of branch mispredictions, this means you could actually suffer two or more mispredictions only the oldest of which actually takes effect: e.g., a younger branch mispredicts, and while executing up to that branch (at which point recovery can occur), another mispredict occurs, so the younger one ends up getting discarded.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With