Return stack buffer?

Question

As I understand it, the Return Stack Buffer (RSB) only supports 4 to 16 entries (from the wiki: http://en.wikipedia.org/wiki/Branch_predictor#Prediction_of_function_returns) and is not a key-value structure (that is, it is not indexed by the position of the ret instruction). Is that true? What happens to the RSB when a context switch happens?

Suppose we enter 50 nested functions that have not yet returned, on a CPU whose return stack buffer has 16 entries. What happens then? Does it mean all predictions fail? Can you illustrate it? Is the scenario the same for recursive function calls?

Lewis Kelsey · Accepted Answer

The BPU (branch prediction unit) can contain its own RAS (return address stack) predictor, which pushes the assumed NLIP of a call (the IP of the instruction following the call) onto the RAS when it predicts a call-type branch in the BTB (branch target buffer). The next return it predicts in the BTB will use the top of the RAS as the predicted address (much as, when it predicts a regular indirect branch, a parallel hit in the ITA outranks the target address in the BTB).

The BAC (branch address calculator) verifies and, if necessary, overrides these return target predictions at decode. It pushes the NLIP of every call instruction onto its own RSB, and the predicted target of the next return is compared with the address on top of that RSB. If the prediction is incorrect, the BAC issues a BAclear and resteers the next-IP logic at the start of the pipeline to the correct return address (which might itself turn out to be wrong if the RSB is corrupted). It probably also overwrites the RAS predictor stack with the BAC RSB state.
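The decode-time check described above can be sketched roughly as follows. This is a toy model under stated assumptions, not a description of any real front end: a Python list stands in for the BAC's RSB, and the names `on_call_decoded` and `verify_return_prediction`, as well as the addresses, are made up for illustration. The underflow case keeps the predictor's choice, which is only a guess about the hardware.

```python
def on_call_decoded(bac_rsb, nlip):
    """Decode of a call: push the address of the following instruction."""
    bac_rsb.append(nlip)


def verify_return_prediction(bac_rsb, predicted_target):
    """Decode of a return: compare the predictor's target with the RSB top.

    Returns (target_to_fetch, resteer). resteer=True models a BAclear.
    """
    if not bac_rsb:
        # Assumed underflow behaviour: nothing to compare against,
        # so the predictor's target is left alone.
        return predicted_target, False
    expected = bac_rsb.pop()
    if predicted_target == expected:
        return predicted_target, False
    # Mismatch: resteer the front end to the address the RSB holds.
    return expected, True


bac_rsb = []
on_call_decoded(bac_rsb, 0x401005)                  # hypothetical NLIP
print(verify_return_prediction(bac_rsb, 0x40FFFF))  # wrong guess, resteer
```

Note that the "correct" address is only as good as the RSB contents: if the RSB itself is stale, the resteer goes to a wrong address and the error is only caught at execution.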

In one implementation, the BAC attaches the RSB TOS (top-of-stack) pointer, along with the fall-through address, to every branch prediction it verifies. Once the branch executes and the real outcome is known, the RSB TOS is restored if a misprediction occurred. A more efficient approach, I think, is to keep an architectural RSB at retirement, which is copied into the BAC RSB and the RAS predictor on a pipeline flush or misprediction. That prevents restoring to a corrupt RSB.
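To see why restoring only the TOS pointer can still leave a corrupt RSB, here is a small sketch. It assumes a circular buffer where only the pointer is checkpointed; the class `PointerCheckpointRsb` and the function `wrong_path_pop_then_push` are hypothetical names, and the addresses are arbitrary.

```python
class PointerCheckpointRsb:
    """Toy circular RSB that checkpoints only its top-of-stack pointer."""

    def __init__(self, size=16):
        self.size = size
        self.entries = [0] * size
        self.tos = 0  # index of the next free slot

    def push(self, addr):
        self.entries[self.tos] = addr
        self.tos = (self.tos + 1) % self.size

    def pop(self):
        self.tos = (self.tos - 1) % self.size
        return self.entries[self.tos]


def wrong_path_pop_then_push():
    rsb = PointerCheckpointRsb()
    rsb.push(0xA000)      # return address of a real call
    saved_tos = rsb.tos   # checkpoint taken at a predicted branch
    rsb.pop()             # wrong path: a speculative return ...
    rsb.push(0xB000)      # ... then a speculative call reuses the slot
    rsb.tos = saved_tos   # misprediction detected: pointer restored
    return rsb.pop()      # the entry contents were not restored


print(hex(wrong_path_pop_then_push()))  # 0xb000, not 0xa000
```

In this model the pointer comes back to the right place, but the slot it points at was overwritten on the wrong path. Copying a whole retirement-time RSB, as suggested above, would avoid that at the cost of more state.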

The RAS predictor is likely to be a circular stack, which may or may not have overflow and underflow checks and guarantees, depending on the implementation. A new prediction likely overwrites the oldest prediction when the stack is full, so that the stack is always up to date (rather than refusing to add it when full, which would mean keeping a counter of how many calls / returns it is unable to make predictions for). On an underflow, it likely refuses to make a prediction and uses the ITA to make the prediction instead. If the RSB underflows, it probably doesn't override the prediction made by the RAS predictor.
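To illustrate the 50-deep case from the question, here is a minimal simulation. It assumes a 16-entry circular stack that overwrites the oldest entry when full and declines to predict on underflow; real hardware may differ on both points. `CircularReturnStack`, `simulate_nested_calls` and the addresses are hypothetical.

```python
class CircularReturnStack:
    """Toy fixed-size circular return address stack."""

    def __init__(self, size=16):
        self.size = size
        self.entries = [None] * size
        self.top = 0    # index of the next free slot
        self.valid = 0  # number of live entries, saturates at size

    def push(self, return_address):
        # When full, this overwrites the oldest entry.
        self.entries[self.top] = return_address
        self.top = (self.top + 1) % self.size
        self.valid = min(self.valid + 1, self.size)

    def pop(self):
        # Assumed underflow behaviour: no prediction (None).
        if self.valid == 0:
            return None
        self.top = (self.top - 1) % self.size
        self.valid -= 1
        return self.entries[self.top]


def simulate_nested_calls(depth, size=16):
    """Nest `depth` calls, then return from all; count (hits, misses)."""
    ras = CircularReturnStack(size)
    actual = []
    for level in range(depth):
        addr = 0x1000 + 0x10 * level  # distinct, made-up return addresses
        ras.push(addr)
        actual.append(addr)
    hits = misses = 0
    while actual:
        if ras.pop() == actual.pop():
            hits += 1
        else:
            misses += 1
    return hits, misses


print(simulate_nested_calls(50))  # (16, 34)
```

Under these assumptions, the 16 innermost returns are predicted correctly and the remaining 34 get no RAS prediction, so not all predictions fail. Recursion is a partly different case: if every recursive call comes from the same call site, all the return addresses are identical, so a design that simply wraps around instead of declining could still predict the deeper returns correctly.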

A hardware interrupt that leads to a context switch results in the pipeline being cleared when the final uop of a macro-op executes. The RSB is likely restored to an architectural state so that execution can continue after the interrupt. It is likely possible for the predictor RAS / BAC RSB to be flushed in microcode, and if it becomes corrupted, it eventually corrects itself.

Tags:

cpu-architecture

branch-prediction

x86

cpu

micro-architecture

user683595
