How does a return address register work in a processor architecture that doesn't store the return address on the stack?

Question

I'm trying to figure out how an architecture that stores the return address of a call in a register (RR) would work (as opposed to pushing and popping the return address on the stack).

Wouldn't the return address register be overwritten every time a nested call is made (therefore making a return impossible past one return)? Reading my homework problem, I'm supposed to modify an assembly program to use an RR register that stores the return address of calls instead of pushing and popping it on the stack. I've searched for how that would work, but either there's nothing out there, the information is well hidden, or my googling skills aren't that great.

I'm not asking for the problem to be solved, but I would like to know how storing the return address in one register is feasible with multiple calls in a program without subsequently storing the register value on the stack (which would defeat the point of the exercise).

Thanks for any help.

Peter Cordes · Accepted Answer

Yes, on ISAs that use a "link register" to pass a return address, non-leaf function have to save/restore their return address, very similar to how they'd save a call-preserved register they wanted to use inside the function. i.e. normally on the callstack.

Many RISC ISAs don't have push/pop instructions, but the same operation can be done with multiple instructions. e.g. subtract from the stack pointer to make space, then save some registers including LR on function entry. Then before returning, reload registers and add to the stack pointer to restore the caller's value of SP and whatever other registers.

Leaf functions (that don't make any function calls) can just leave that register alone so the return address is still there when they ret (or whatever the return instruction is called, e.g. MIPS jr $ra - jump-register to the return-address register).

Look at compiler output for example:

void external();

void foo(int *p) {
    external();
    *p = 0;    // defeat tail-call optimization
}

compiled for MIPS by GCC 5.4 -O2 -fno-delayed-branch on the Godbolt compiler explorer

foo(int*):
        addiu   $sp,$sp,-32        # reserve 32 bytes of stack space (MIPS calling convention I think guarantees some "shadow space" for callees)
        sw      $31,28($sp)        # $31 is MIPS's $ra return address reg
        sw      $16,24($sp)        # $16 is a call-preserved register
        move    $16,$4             # save p for later use
        jal     external
        nop                 # branch-delay slot

        lw      $31,28($sp)        # reload return address
        sw      $0,0($16)          # *p = 0
        lw      $16,24($sp)        # restore caller's $16
        addiu   $sp,$sp,32         # restore stack
        j       $31                # jump to return address
        nop                 # branch delay slot

It's not generally necessary for a function to return with the return address in the same register it was in before, depending on what kind of return instruction the ISA uses. It's typical, and maybe help branch prediction on some microarchitectures, though.

32-bit ARM is fun, and has microcoded push / pop instructions that take a bitfield of registers to push and pop. So it's typical to push {r4, lr} on function entry and pop {r4, pc} as the return instruction. (ARM has the program-counter as one of the 16 architectural general-purpose registers. Writing to it is a jump.) Pushing r4 along with the Link Register lr keeps the stack aligned, and gives you a call-preserved register to play with.

rcgldr · Answer

Assuming recursion is not required, you could invent a convention that the link (return register) is stored in a different register, depending on level of nesting.

Note that IBM mainframes in classic mode do not have a stack. Instead, the caller provides a save area, pointed to by R13, then when the call is made, R14 contains return address, and R15 is base address of the function called. For recursion, each caller allocates a new save area from the heap before making a call. The convention is for callee to store R13 in it's proper location in the save area, creating a link chain of save areas, called a "linkage stack". When returning, the callee would need to free its allocated save area just before returning.

How does a return address register work in a processor architecture that doesn't store the return address on the stack?

Tags:

cpu-architecture

assembly

ElSnowman

2 Answers

Peter Cordes

rcgldr

Recent Activity

Donate For Us

How does a return address register work in a processor architecture that doesn't store the return address on the stack?

Tags:

cpu-architecture

assembly

ElSnowman

2 Answers

Peter Cordes

rcgldr

Related questions

Recent Activity

Donate For Us