Why does ARM GCC push registers r3 and lr onto the stack at the beginning of a function?

Question

I wrote a simple test program like this (main.c):

main.c
void test(){
}
void main(){
    test();
}

Then I used arm-none-eabi-gcc to compile it and objdump to get the assembly code:

arm-none-eabi-gcc -g -fno-defer-pop -fomit-frame-pointer -c main.c
arm-none-eabi-objdump -S main.o > output

The assembly code pushes the r3 and lr registers, even though the function does nothing.

main.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <test>:
void test(){
}
   0:   e12fff1e        bx      lr

00000004 <main>:
void main(){
   4:   e92d4008        push    {r3, lr}
        test();
   8:   ebfffffe        bl      0 <test>
}
   c:   e8bd4008        pop     {r3, lr}
  10:   e12fff1e        bx      lr

My question is: why does ARM GCC choose to push r3 onto the stack, even though test() never uses it? Does GCC just pick a register at random to push? If it's for the stack alignment requirement (8 bytes on ARM), why not just subtract from sp? Thanks.

==================Update==========================

@KemyLand Regarding your answer, I have another example. The source code is:

void test1(){
}
void test(int i){
        test1();
}
void main(){
        test(1);
}

Using the same compile commands as above, I get the following assembly:

main.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <test1>:
void test1(){
}
   0:   e12fff1e        bx      lr

00000004 <test>:
void test(int i){
   4:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
   8:   e24dd00c        sub     sp, sp, #12
   c:   e58d0004        str     r0, [sp, #4]
        test1();
  10:   ebfffffe        bl      0 <test1>
}
  14:   e28dd00c        add     sp, sp, #12
  18:   e49de004        pop     {lr}            ; (ldr lr, [sp], #4)
  1c:   e12fff1e        bx      lr

00000020 <main>:
void main(){
  20:   e92d4008        push    {r3, lr}
        test(1);
  24:   e3a00001        mov     r0, #1
  28:   ebfffffe        bl      4 <test>
}
  2c:   e8bd4008        pop     {r3, lr}
  30:   e12fff1e        bx      lr

If push {r3, lr} in the first example is there to use fewer instructions, why didn't the compiler use just one instruction in test()?

push {r0, lr}

It uses 3 instructions instead of 1:

push {lr}
sub sp, sp, #12
str r0, [sp, #4]

By the way, why does it subtract 12 from sp? The stack is 8-byte aligned, so it could just subtract 4, right?

3442 · Accepted Answer

According to the Procedure Call Standard for the ARM Architecture (AAPCS), on which the ARM Embedded ABI is based, r0 through r3 are used to pass arguments to a function, and r0 (with r1 for wider values) holds its return value. Meanwhile, lr (a.k.a. r14) is the link register, whose purpose is to hold the return address of a function.
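To make the register roles concrete, here is a minimal sketch, assuming the AAPCS rule that the first four word-sized arguments travel in r0 through r3. The function name sum5 is hypothetical and does not come from the question.

```c
/* Hypothetical example: under the AAPCS, the first four word-sized
   arguments (a, b, c, d) arrive in r0, r1, r2 and r3, while the fifth
   (e) is passed on the stack. The int result is returned in r0. */
int sum5(int a, int b, int c, int d, int e) {
    return a + b + c + d + e;
}
```

Because r0 through r3 are scratch registers in this scheme, a caller cannot expect their contents to survive a call.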

It's obvious that lr must be saved: the bl to test() overwrites lr with a new return address, so otherwise main() would have no way to return to its caller.

It's worth noting that every ARM-state instruction is 32 bits wide and, as you mentioned, ARM requires the call stack to be 8-byte aligned. As a bonus, we're using the Embedded ARM ABI, so code size matters. Thus, it's more efficient to have a single 32-bit instruction that both saves lr and keeps the stack aligned by pushing a register whose value doesn't matter (r3 is a scratch register, and it carries nothing here because test() takes no arguments and returns nothing), followed by a single 32-bit pop, rather than adding more instructions (and thus wasting precious memory) to manipulate the stack pointer.
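As a rough illustration of the arithmetic, the sketch below counts the registers named in a push encoding and checks whether the pushed bytes are a multiple of 8. It assumes the ARM-state encoding in which the low 16 bits of the instruction word form the register list (as in e92d4008 from the disassembly in the question). The helper names are hypothetical, and so is the lr-only encoding 0xe92d4000 mentioned afterwards, which is an illustration rather than compiler output.

```c
#include <stdint.h>

/* Size of one ARM core register in bytes. */
#define REG_BYTES 4u
/* Stack alignment required at function calls, as stated in the question. */
#define STACK_ALIGN 8u

/* Count the registers named in the low 16 bits (the register list)
   of an ARM-state push/pop encoding. Bit n set means rn is in the list. */
unsigned reglist_count(uint32_t insn) {
    unsigned count = 0;
    for (unsigned reg = 0; reg < 16; ++reg) {
        if (insn & (1u << reg)) {
            ++count;
        }
    }
    return count;
}

/* Number of bytes by which the push moves sp. */
unsigned pushed_bytes(uint32_t insn) {
    return reglist_count(insn) * REG_BYTES;
}

/* Non-zero if an sp that was 8-byte aligned stays aligned after the push. */
int keeps_alignment(uint32_t insn) {
    return pushed_bytes(insn) % STACK_ALIGN == 0;
}
```

With these helpers, 0xe92d4008 (push {r3, lr}) names two registers and moves sp by 8 bytes, while a register list holding only lr (0xe92d4000) would move it by 4 and leave the stack misaligned.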

In short, it's reasonable to conclude that this is simply a code-size optimization by GCC.

Tags:

c

gcc

assembly

arm

Alan
