
R7 and R11 relation with Link Register in ARM architecture (thumb/arm) calling convention

I was looking at ARM assembly code generated by GCC, and I noticed that it compiled a function to the following code:

   0x00010504 <+0>:     push    {r7, lr}
   0x00010506 <+2>:     sub     sp, #24
   0x00010508 <+4>:     add     r7, sp, #0
   0x0001050a <+6>:     str     r0, [r7, #4]
=> 0x0001050c <+8>:     mov     r3, lr
   0x0001050e <+10>:    mov     r1, r3
   0x00010510 <+12>:    movw    r0, #1664   ; 0x680
   0x00010514 <+16>:    movt    r0, #1
   0x00010518 <+20>:    blx     0x10378 <printf@plt>
   0x0001051c <+24>:    add.w   r3, r7, #12
   0x00010520 <+28>:    mov     r0, r3
   0x00010522 <+30>:    blx     0x10384 <gets@plt>
   0x00010526 <+34>:    mov     r3, lr
   0x00010528 <+36>:    mov     r1, r3
   0x0001052a <+38>:    movw    r0, #1728   ; 0x6c0
   0x0001052e <+42>:    movt    r0, #1
   0x00010532 <+46>:    blx     0x10378 <printf@plt>
   0x00010536 <+50>:    adds    r7, #24
   0x00010538 <+52>:    mov     sp, r7
   0x0001053a <+54>:    pop     {r7, pc}

What I found interesting is that GCC seems to use R7 to pop the values into PC instead of using LR. I saw a similar thing with R11: the compiler pushes R11 and LR onto the stack and then pops into R11 and PC. Shouldn't LR act as the return address instead of R7 or R11? Why is R7 (which is the frame pointer in Thumb mode) being used here? The Apple iOS calling convention is different again: it pops other registers (e.g. r4 to r7) together with PC to return control. Shouldn't it use LR?

Or am I missing something here?

Another question: it looks like the LR, R11, or R7 values are never the return address itself, but rather a pointer to the stack location that contains the return address. Is that right?

Another odd thing is that the compiler does not always generate the same function epilogue. For example, instead of popping into PC it might use bx lr. Why?

Sama Azari asked Nov 16 '25 06:11


1 Answer

Well, first off, they likely want to keep the stack aligned on a 64-bit boundary.

R7 is a better choice for a frame pointer than any higher register, because registers r8 to r15 are not supported by most 16-bit Thumb instructions. I would have to look it up, but I assume there are special pc-relative and sp-relative load/store instructions, so why burn r7 at all?

I am not sure I follow everything you are asking. In Thumb you can push lr and later pop pc, and I think that is equivalent to bx lr, but you have to look it up for each architecture, because on some you cannot switch modes with pop. In this case the compiler appears to assume pop pc is safe and does not burn extra instructions on a pop r3, bx r3 kind of sequence. Actually, doing it that way would likely have needed two extra instructions: pop r7, pop r3, bx r3.
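To make the push lr / pop pc pairing concrete, here is a minimal C sketch of a toy register file and stack. The names (Cpu, push_r7_lr, pop_r7_pc, simulate_call) are invented for illustration and do not come from any toolchain. It only models the data movement: the return address sits in lr itself on entry, a copy of it is saved to the stack, and the epilogue loads that copy straight into pc while r7 gets the caller's r7 back.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16   /* hypothetical toy stack: 64 bytes */

/* Toy model of a few core registers and a full-descending stack. */
typedef struct {
    uint32_t r7, lr, pc;
    uint32_t sp;                 /* byte offset into mem, multiple of 4 */
    uint32_t mem[STACK_WORDS];
} Cpu;

/* push {r7, lr}: the lower-numbered register lands at the lower address. */
void push_r7_lr(Cpu *c) {
    assert(c->sp >= 8 && c->sp % 4 == 0);
    c->sp -= 8;
    c->mem[c->sp / 4]     = c->r7;
    c->mem[c->sp / 4 + 1] = c->lr;
}

/* pop {r7, pc}: reads the same two slots, but the second goes to pc. */
void pop_r7_pc(Cpu *c) {
    assert(c->sp + 8 <= STACK_WORDS * 4 && c->sp % 4 == 0);
    c->r7 = c->mem[c->sp / 4];
    c->pc = c->mem[c->sp / 4 + 1];
    c->sp += 8;
}

/* Walks through prologue, a clobbered lr, and epilogue; returns final pc. */
uint32_t simulate_call(uint32_t caller_r7, uint32_t return_address) {
    Cpu c = { .r7 = caller_r7, .lr = return_address, .pc = 0,
              .sp = STACK_WORDS * 4 };
    push_r7_lr(&c);
    c.r7 = c.sp;            /* the body uses r7 as its frame pointer */
    c.lr = 0xDEADBEEFu;     /* a nested bl/blx overwrites lr */
    pop_r7_pc(&c);
    assert(c.r7 == caller_r7);   /* the caller's r7 is restored */
    return c.pc;
}
```

Calling simulate_call with any caller r7 and return address gives back that same return address, even though lr was overwritten in between. That is why the function has to save lr when it makes calls of its own, and why popping the saved value directly into pc does the job of a return.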

So it may be that one compiler is told which architecture is being targeted and can assume pop pc is safe, while another cannot be sure. Again, you have to read the ARM architecture documentation for each architecture to learn which instructions can be used to change modes and which cannot. If you walk through various architecture types with GNU, the way it returns may change.
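As a rough summary of that per-architecture difference, the sketch below encodes it as a small C lookup. The enum and function names are made up for this example, and only four architectures are listed. It assumes the usual rule that a load into pc only behaves like bx (honouring bit 0 of the loaded address) from ARMv5T onward, while on ARMv4T it does not, so a return that may need to switch modes has to go through bx.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of -march targets, for illustration only. */
typedef enum { ARCH_ARMV4T, ARCH_ARMV5T, ARCH_ARMV6M, ARCH_ARMV7M } Arch;

/* True when pop {..., pc} can be used as an interworking return. */
bool pop_pc_interworks(Arch arch) {
    return arch != ARCH_ARMV4T;
}

/* Thumb epilogue length for a function that saved one register plus lr:
 * either a single pop {rx, pc}, or pop {rx}; pop {ry}; bx ry. */
int epilogue_instruction_count(Arch arch) {
    assert(arch >= ARCH_ARMV4T && arch <= ARCH_ARMV7M);
    return pop_pc_interworks(arch) ? 1 : 3;
}
```

Under that assumption, only the ARMv4T case needs the three-instruction return and every later target in the list gets the single pop.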

EDIT

unsigned int morefun ( unsigned int, unsigned int );
unsigned int fun ( unsigned int x, unsigned int y )
{
    x+=1;
    return(morefun(x,y+2)+3);
}

arm-none-eabi-gcc -O2 -mthumb -c so.c -o so.o
arm-none-eabi-objdump -D so.o

00000000 <fun>:
   0:   b510        push    {r4, lr}
   2:   3102        adds    r1, #2
   4:   3001        adds    r0, #1
   6:   f7ff fffe   bl      0 <morefun>
   a:   3003        adds    r0, #3
   c:   bc10        pop     {r4}
   e:   bc02        pop     {r1}
  10:   4708        bx      r1
  12:   46c0        nop                 ; (mov r8, r8)

arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m3 -march=armv7-m -c so.c -o so.o
arm-none-eabi-objdump -D so.o

00000000 <fun>:
   0:   b508        push    {r3, lr}
   2:   3102        adds    r1, #2
   4:   3001        adds    r0, #1
   6:   f7ff fffe   bl      0 <morefun>
   a:   3003        adds    r0, #3
   c:   bd08        pop     {r3, pc}
   e:   bf00        nop

Using just that -march without the -mcpu gives the same result (it does not pop lr into r1 and bx).

-march=armv5t changes it up slightly:

00000000 <fun>:
   0:   b510        push    {r4, lr}
   2:   3102        adds    r1, #2
   4:   3001        adds    r0, #1
   6:   f7ff fffe   bl      0 <morefun>
   a:   3003        adds    r0, #3
   c:   bd10        pop     {r4, pc}
   e:   46c0        nop                 ; (mov r8, r8)

armv4t, as expected, does the pop and bx thing.

armv6-m gives what armv5t gave.

This was gcc version 6.1.0, built using --target=arm-none-eabi without any other ARM specifier.

So, if I understand the question correctly, the OP is probably seeing the three-instruction pop, pop, bx sequence rather than a single pop {rx, pc}, or at least seeing one compiler vary compared to another. Apple iOS was mentioned, so that toolchain likely defaults to a heavier-duty core rather than a works-everywhere type of target. And their gcc, like mine, defaults to code that works everywhere (including the original ARMv4T) rather than everywhere but the original. I assume that if you add some command-line options you will see the gcc compiler behave differently, as I have demonstrated.

Note that in these examples r3 and r4 are not used, so why preserve them? It is likely the first thing I mentioned: keeping the stack 64-bit aligned. With the works-on-all-Thumb-variants solution, if you get an interrupt between the two pops, the interrupt handler is dealing with an unaligned stack. Since r4 was a throwaway anyway, they could have popped r1 and r2 (or r2 and r3) and then done bx r2 (or bx r3), which would have avoided that unaligned moment and saved an instruction. Oh well...
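The alignment point can be checked with a little arithmetic. The C sketch below uses invented helper names and assumes sp is 8-byte aligned on entry to the function. It counts how many instruction boundaries in the epilogue leave sp off a 64-bit boundary, first for the pop {r4}; pop {r1}; bx r1 sequence and then for a single two-register pop followed by bx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

bool is_8_byte_aligned(uint32_t sp) {
    return (sp & 7u) == 0;
}

/* Epilogue: pop {r4}; pop {r1}; bx r1 (after push {r4, lr}). */
int misaligned_windows_split_pops(uint32_t entry_sp) {
    assert(is_8_byte_aligned(entry_sp));   /* assumption: aligned on entry */
    uint32_t sp = entry_sp - 8;            /* push {r4, lr} */
    int windows = 0;
    sp += 4;                               /* pop {r4} */
    if (!is_8_byte_aligned(sp)) windows++;
    sp += 4;                               /* pop {r1} */
    if (!is_8_byte_aligned(sp)) windows++;
    return windows;                        /* bx r1 leaves sp unchanged */
}

/* Epilogue: pop {r1, r2}; bx r2 (after pushing the same two words). */
int misaligned_windows_single_pop(uint32_t entry_sp) {
    assert(is_8_byte_aligned(entry_sp));
    uint32_t sp = entry_sp - 8;            /* two-register push */
    int windows = 0;
    sp += 8;                               /* pop {r1, r2} */
    if (!is_8_byte_aligned(sp)) windows++;
    return windows;
}
```

For any 8-byte-aligned entry value, the split version reports one misaligned window (right after the first pop) and the single-pop version reports none.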

old_timer answered Nov 17 '25 20:11
