I have been searching for an answer to this for over a week now with no luck. So far I had learnt that the stack saves the return address when function calls nest or an interrupt occurs, but I recently learned that modern processors use a Link Register to achieve the same goal. After some research, I found that the stack was indeed used to save the return address in older processors. However, it doesn't make sense to me why modern processors dedicate an entire separate register (LR) to the return address when the older implementation was working. What are the benefits of LR over the stack-based implementation?
Thanks in advance!!!
In AArch64 state, the Link Register (LR) stores the return address when a subroutine call is made. It can also be used as a general-purpose register if the return address is stored on the stack.
A link register (LR for short) is a register which holds the address to return to when a subroutine call completes.
The function return address is placed on the stack by the x86 CALL instruction, which stores the current value of the EIP register (by then pointing at the instruction after the CALL). Then the frame pointer, that is, the previous value of the EBP register, is placed on the stack.
R14 (the Link Register) holds the return address from a subroutine entered when you use the branch with link (BL) instruction. It too can be used as a general purpose register when it is not supporting returns from subroutines.
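To make the two mechanisms above concrete, here is a minimal sketch of a toy machine in C. Everything in it (the `ToyCpu` struct, the function names, the fixed 4-byte instruction size) is a hypothetical simplification for illustration and not a model of any real ISA's encoding; x86 instructions in particular are variable-length.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine, for illustration only: all names here are hypothetical. */
enum { STACK_WORDS = 16, INSN_SIZE = 4 };

typedef struct {
    uint32_t pc;                  /* program counter */
    uint32_t lr;                  /* link register (R14 on 32-bit ARM) */
    uint32_t sp;                  /* index of the top stack slot; grows down */
    uint32_t stack[STACK_WORDS];  /* stands in for stack memory */
    unsigned mem_accesses;        /* stack loads + stores performed so far */
} ToyCpu;

void cpu_reset(ToyCpu *cpu, uint32_t pc)
{
    cpu->pc = pc;
    cpu->lr = 0;
    cpu->sp = (uint32_t)STACK_WORDS;  /* empty stack */
    cpu->mem_accesses = 0;
}

/* BL-style call: the return address goes into LR, memory is untouched. */
void branch_with_link(ToyCpu *cpu, uint32_t target)
{
    cpu->lr = cpu->pc + INSN_SIZE;
    cpu->pc = target;
}

/* Return for the BL-style call: jump back through LR. */
void return_via_lr(ToyCpu *cpu)
{
    cpu->pc = cpu->lr;
}

/* CALL-style call: the return address is stored to the stack. */
void call_push(ToyCpu *cpu, uint32_t target)
{
    assert(cpu->sp > 0);              /* guard against overflow */
    cpu->stack[--cpu->sp] = cpu->pc + INSN_SIZE;
    cpu->mem_accesses++;              /* one store */
    cpu->pc = target;
}

/* RET-style return: the return address is loaded back from the stack. */
void return_pop(ToyCpu *cpu)
{
    assert(cpu->sp < (uint32_t)STACK_WORDS);  /* guard against underflow */
    cpu->pc = cpu->stack[cpu->sp++];
    cpu->mem_accesses++;              /* one load */
}
```

In this sketch, a `branch_with_link` followed by `return_via_lr` leaves `mem_accesses` at 0, while a `call_push` followed by `return_pop` leaves it at 2.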
RISC architectures tend to have fewer special instructions or behaviours - instead, standard instructions are used for stack management. This usually means that programs are larger, the CPU itself is simpler, and the compiler is expected to optimise harder.
Consider:
int bar(int a)
{
    return a * a;
}

void foo(void)
{
    bar(22);
}

int main(void)
{
    foo();
    return 0;
}
Here, bar() is a leaf function - one that doesn't go on to make further function calls. Therefore, the return address in LR will never be overwritten. As a consequence, there's no need for it to be written to the stack at all. This saves a store to and a load from memory.
foo() on the other hand will mutate LR because it makes a function call, so it will need to store the caller's return address on the stack.
Contrast this with an architecture in which making a function call automatically pushes the return address to the stack: there, this optimisation is not possible.
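As a rough way to put numbers on this, the following sketch counts the memory accesses spent purely on return addresses under each scheme. The `CallConvention` enum, the `Invocation` struct and the function name are made up for this illustration, and the model assumes no inlining or tail-call optimisation, either of which a real compiler might well apply to code this small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the two schemes being compared. */
typedef enum { LINK_REGISTER, PUSH_ON_CALL } CallConvention;

/* One entry per function invocation; a leaf makes no further calls. */
typedef struct {
    const char *name;
    bool is_leaf;
} Invocation;

/* Count the stack loads and stores needed just for return addresses. */
unsigned return_address_accesses(CallConvention conv,
                                 const Invocation *calls, size_t n)
{
    unsigned accesses = 0;

    assert(calls != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* PUSH_ON_CALL: every call stores the address, every return loads it.
           LINK_REGISTER: only a non-leaf must spill LR and reload it later. */
        if (conv == PUSH_ON_CALL || !calls[i].is_leaf) {
            accesses += 2;  /* one store plus one load */
        }
    }
    return accesses;
}
```

For the example above (a call to `foo`, which is not a leaf, and a call to `bar`, which is), this model gives 2 accesses with a link register and 4 with push-on-call. The gap grows with the share of calls that land in leaf functions.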
All versions of the ARM Procedure Call Standard define a set of callee-saved registers - registers that the caller can expect to be preserved across a function call. If the function is trivial enough not to need them, nothing has to be saved, so it again results in no memory access.
In interrupts, timing is often more critical. Classic ARM CPUs have a set of banked (shadow) registers, for example R8-R14 in FIQ mode, which are only accessible in the corresponding interrupt mode. This means that trivial interrupt handlers can be written that require no memory access.