Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding the Tomasulo algorithm

So I'm trying to understand the Tomasulo algorithm for out-of-order instruction execution. Here's what I'm getting so far:

  • Instructions are fetched in-order and stored in an instruction queue.

  • Register renaming happens somewhere next...? From what I understand this is to avoid WAR/WAW hazards by giving labels to the registers. Say you have add r1,r2,r3 (1) add r3,r5,r6 (2) You have a WAR hazard and need to ensure that instruction (1) reads the old value of r3 before adding it to r1. So I guess within the instruction queue (?) the hardware renames the registers, i.e. add r1,r2,r3#1 add r3#2,r5,r6 Or something like that.

  • Instructions are issued to reservation stations. From what I understand, each functional unit has its own set of reservation stations. But is it like a queue (FIFO) of instructions for that functional unit to execute when the appropriately tagged operands are available on the common data bus?

  • Since instructions can finish in arbitrary order (out of order), and more instructions can keep coming... Is there like a stage where the common data bus updates the register file before more instructions come in? I've heard a re-order buffer is used, which basically sorts the instructions back in order (this must imply instructions have a tag of some sort) and then the register results are committed back into the register file.

What I'm confused about is the implementation of register renaming, and the structure of the reservation stations.

Thank you for any help.

like image 448
JDS Avatar asked Jun 27 '12 04:06

JDS


People also ask

Is Tomasulo algorithm still used?

Every modern OoO exec CPU uses Tomasulo's algorithm for register renaming. The basic idea of renaming onto more physical registers in a kind of SSA dependency analysis hasn't changed.

What is the difference between Scoreboarding and Tomasulo?

In Tomasulo's algorithm, the control logic is distributed among the reservation stations, whereas in scoreboarding, the scoreboard keeps track of everything.

How does Tomasulo approach increase ILP?

We will discuss the Tomasulo's technique here. The key idea is to allow instructions that have operands available to execute, even if the earlier instructions have not yet executed. For example, consider the following sequence of instructions: DIV.D F0 , F2 , F4. ADD.D F10 , F0 , F8.

How does a reorder buffer help speculation?

The re-order buffer and in-order commit allow us to flush the speculative instructions from the machine when a misprediction is discovered. ROB is another possible source of operands ▪ ROB can provide precise exception in an out-of-order machine ▪ ROB allows us to ignore exceptions on speculative code.


1 Answers

Tomasulo's algorithm is really not tied to any specific hardware and in fact, in real machines, register renaming generally happens before instructions are inserted into the instruction queue. Let's get a couple of fundamental points across. First, the registers as expressed in your program are logical registers. Second, the actual hardware registers are called physical registers. Tomasulo's algorithm is simply a mechanism for mapping logical registers to physical registers. In real machines, there are generally two tables mapping logical registers to physical registers. One in the rename stage and one in the commit stage. There also must be more physical registers than logical ones. It works essentially this way:

  1. For each logical input, look up in the rename stage mapping table to find out which physical register currently holds that logical register value.
  2. For each logical output, find an unused physical register and map the output to it. Update the rename stage mapping table. If no physical register is available, wait until one becomes available.
  3. In the commit stage, when an instruction is committed, overwrite the old logical to physical mappings of the instruction's logical outputs with the new mappings. Free physical registers that were part of the old mappings.
  4. If there is a misprediction (or any kind of squash in the pipeline), the table in the commit stage overwrites the table in the rename stage and everything in the pipeline in-between is blown away.

The part about reservation stations is really a specific implementation of an out-of-order pipeline and essentially comes along with a physical register. There are plenty of machines that don't really have a notion of a reservation station.

Hennessy and Patterson's books on Computer Architecture are the standard textbook on this sort of stuff. I tried to explain this as simply as possible, but there are literally thousands of proposals for optimizing this stuff.

like image 139
Nathan Binkert Avatar answered Nov 04 '22 22:11

Nathan Binkert