While studying a compilers course, I am left wondering why we use registers at all. It is often the case that the caller or callee must save a register's value and then restore it.
In a way, they always end up using the stack anyway. Is the additional complexity of using registers really worth it?
Excuse my ignorance.
Update: Please, I know that registers are faster than RAM and the various levels of cache. My main concern is that one has to "save" the value that is in the register and then "restore" it afterwards. In both cases we are accessing some kind of memory. Would it not be better to use that memory in the first place?
It has been said that register machines are more efficient than stack machines because register machines can be pipelined for speed while stack machines cannot: every stack instruction implicitly depends on the top of the stack, which serialises the instruction stream.
Under non-JIT settings, a stack-based VM will be popping and then pushing the same operands many times, while a register-based VM will simply allocate the right number of registers and operate on them, which can significantly reduce the number of operations and the CPU time.
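As a rough illustration (hypothetical mini-VMs, not any real bytecode format), here is the expression `a*b + a*c` evaluated on a tiny stack machine and a tiny register machine. The stack version has to push `a` twice and routes every operand through pop/push, while the register version loads each variable once:

```python
# Two toy interpreters illustrating the operand-traffic difference
# between a stack VM and a register VM. Neither instruction set
# corresponds to a real virtual machine.

def run_stack(code, env):
    """Stack VM: every operand flows through push/pop."""
    stack = []
    for op, *args in code:
        if op == "push":                      # load a variable onto the stack
            stack.append(env[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

def run_reg(code, env):
    """Register VM: operands sit in named registers; nothing is popped."""
    regs = {}
    for op, *args in code:
        if op == "load":
            regs[args[0]] = env[args[1]]
        elif op == "add":
            d, a, b = args
            regs[d] = regs[a] + regs[b]
        elif op == "mul":
            d, a, b = args
            regs[d] = regs[a] * regs[b]
    return regs[code[-1][1]]                  # destination of last instruction

env = {"a": 2, "b": 3, "c": 4}                # compute a*b + a*c = 14

stack_code = [("push", "a"), ("push", "b"), ("mul",),
              ("push", "a"), ("push", "c"), ("mul",),   # 'a' pushed again
              ("add",)]                                  # 7 instructions

reg_code = [("load", "r0", "a"), ("load", "r1", "b"), ("load", "r2", "c"),
            ("mul", "r1", "r0", "r1"),        # r1 = a*b
            ("mul", "r2", "r0", "r2"),        # r2 = a*c  (reuses r0)
            ("add", "r1", "r1", "r2")]        # 6 instructions

print(run_stack(stack_code, env))             # 14
print(run_reg(reg_code, env))                 # 14
```

Both produce the same result, but the register version executes fewer instructions and never reloads `a`; in a larger expression the gap widens as more common subexpressions stay resident in registers.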
Registers are essentially internal CPU memory. So accesses to registers are easier and quicker than any other kind of memory accesses.
Stack machines have higher code density. In contrast to common stack-machine instructions, which easily fit in 6 bits or fewer, register machines require two or three register-number fields per ALU instruction to select operands; the densest register machines average about 16 bits per instruction plus the operands.
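A back-of-envelope calculation using those figures (6-bit stack opcodes vs roughly 16-bit register instructions; the instruction counts are for a small expression like `a*b + a*c` and are purely illustrative):

```python
# Rough code-density comparison using the bit sizes quoted above.
# These are illustrative figures, not measurements of a real ISA.

STACK_OP_BITS = 6    # zero-address opcode: operands are implicit
REG_OP_BITS = 16     # opcode plus two/three register-number fields

stack_instrs = 7     # push a, push b, mul, push a, push c, mul, add
reg_instrs = 6       # three loads, two muls, one add

stack_bits = stack_instrs * STACK_OP_BITS
reg_bits = reg_instrs * REG_OP_BITS
print(stack_bits, reg_bits)   # 42 vs 96 bits
```

So even though the stack machine executes *more* instructions, its encoding of this expression is less than half the size: the stack machine trades instruction count for density, the register machine trades density for fewer, more parallelisable operations.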
In the speed/latency hierarchy, registers are fastest (usually zero cycle latency), L1 cache is next (typically 1 or more cycles of latency), and then it goes downhill rapidly after that. So in general register accesses are "free" whereas there is always some cost involved in memory accesses, even when that access is cached.
Saving and restoring registers typically only happens (a) at the beginning/end of a function call or context switch, or (b) when the compiler runs out of registers for temporary variables and needs to "spill" one or more registers back to memory. In general, well-optimised code will keep the majority of frequently accessed ("hot") variables in registers, at least within the innermost loop(s) of a function.
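To make that point concrete, here is a toy model (a hypothetical caller-save convention, not a real ABI) showing that the save/restore cost is paid once per call, while the hot loop variable is accessed from its register on every iteration:

```python
# Toy model of caller-save registers around a function call.
# Hypothetical convention for illustration only: the point is that
# memory traffic (save/restore) happens once per call, while register
# traffic happens on every loop iteration.

class Machine:
    def __init__(self):
        self.regs = {}
        self.stack = []
        self.mem_accesses = 0     # counts stack (memory) traffic
        self.reg_accesses = 0     # counts register traffic

    def save(self, r):            # spill one register to the stack
        self.stack.append((r, self.regs[r]))
        self.mem_accesses += 1

    def restore(self):            # reload the most recently saved register
        r, v = self.stack.pop()
        self.regs[r] = v
        self.mem_accesses += 1

def hot_loop(m, n):
    m.regs["acc"] = 0
    for i in range(n):            # 'acc' stays in a register the whole time
        m.regs["acc"] += i
        m.reg_accesses += 2       # one read + one write of 'acc'
    return m.regs["acc"]

def caller(m, n):
    m.regs["x"] = 42
    m.save("x")                   # one save before the call...
    total = hot_loop(m, n)
    m.restore()                   # ...one restore after it
    return total + m.regs["x"]

m = Machine()
result = caller(m, 1000)
print(result)                     # 499500 + 42 = 499542
print(m.mem_accesses)             # 2    (one save, one restore)
print(m.reg_accesses)             # 2000 (every loop iteration)
```

Two memory accesses against two thousand register accesses: that ratio is why compilers accept the save/restore overhead at call boundaries in exchange for register-speed access inside the loop.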
I'd say it's not so much an issue with compilers as with CPUs. Compilers have to work with the target architecture.
Here's what the other answers are glossing over: it depends on the architecture of the CPU at the level of the actual circuitry. Machine instructions boil down to: get data from somewhere, modify the data, and fetch or branch to the next instruction.
Think of the problem like a woodworker building or repairing a chair for you. His questions will be "Where is the chair?" and "What needs to be done to the chair?". He might be able to fix it at your house, or he might need to take the chair back to his shop to work on it. Either way works, but it depends on how prepared he is to work outside of a fixed location. It could slow him down, or it could be his specialty.
Now, back to the CPU.
Regardless of how parallel a CPU may be, like having several adders or instruction decode pipelines, those circuits are located in specific locations on the chip, and the data must be loaded into the places where the operation can be performed. The program is responsible for moving the data into and out of those locations. In a stack-based machine, the ISA might provide instructions that modify data directly, but it may be doing that housekeeping in microcode. An adder works the same way regardless of whether the data came from the stack or from the heap. The difference is in the programming model available to the programmer. Registers are basically a defined place to work on data.
Well, it seems the answer to this was also in the book (Modern Compiler Implementation in Java). The book presents four answers:
Accessing RAM is generally MUCH slower than accessing a register, both in terms of latency and bandwidth. There are CPUs that have a hardware stack of limited size - this allows pushing registers to the stack and popping them back - but they still use registers directly for calculations. Working with a pure stack machine (of which there are many academic examples) is rather difficult too, adding more complexity.