I have a very simple C program that I'm using with GDB to learn more about the stack:
#include<stdlib.h>
#include<stdio.h>
int main(int argc, char* argv[]){
    printf("argc is %d", argc);
    int i = 0;
    for(i; i<argc; i++){
        printf("argv at %d is %s", i, argv[i]);
    }
    return 0;
}
I compile this program using gcc foo.c -g and then run it in gdb using gdb ./a.out. Inside gdb, I set a breakpoint at main using b main, and then show the stack pointer and base pointer:
Reading symbols from ./a.out...done.
(gdb) b main
Breakpoint 1 at 0x40053c: file foo.c, line 5.
(gdb) r
Starting program: /tmp/a.out
Breakpoint 1, main (argc=1, argv=0x7fffffffdf48) at foo.c:5
5 printf("argc is %d", argc);
(gdb) p $sp
$1 = (void *) 0x7fffffffde40
(gdb) p $rbp
$2 = (void *) 0x7fffffffde60
(gdb) x/8x $sp
0x7fffffffde40: 0xffffdf48 0x00007fff 0x00400440 0x00000001
0x7fffffffde50: 0xffffdf40 0x00007fff 0x00000000 0x00000000
(gdb) p &argv
$3 = (char ***) 0x7fffffffde40
(gdb) p &argc
$4 = (int *) 0x7fffffffde4c
So I can see here that argv is stored at the same address that $sp points to, the top of the stack, 0x7fffffffde40. And I also see that argc's address is shortly thereafter at 0x7fffffffde4c.
However, I'm not sure what the data at 0x7fffffffde48 through 0x7fffffffde4b is holding. Is it anything important, or just garbage? Why isn't argv directly adjacent to argc on the stack?
Thanks!
In the x86-64 System V ABI, the first six integer/pointer function args are passed in registers. (For links to other ABI docs, and explanations of what an ABI is, see the x86 tag wiki.)
They only have addresses at all because gcc -O0 spills them to the stack. This makes debugging C/C++ easier and more consistent: every variable has an address, and the value stored there is always up-to-date after every C statement. However, it makes for terribly inefficient asm. gcc -Og isn't as strict about storing to memory all the time, so you sometimes get "value optimized out", but it's still "optimized for debugging".
The other goal of gcc -O0 is to compile fast, not to make good code. So don't be surprised that it makes non-optimal decisions about laying out locals on the stack. e.g. it could have reserved only 16 bytes, putting argv at [rbp-16] (8-byte aligned) and argc at [rbp-8] (4-byte aligned), and keeping the 4B temporary at [rbp-4], which is where gcc5.3 actually put it.
The only "reason" for the gap between their actual storage locations is the inner workings of gcc's algorithms for laying out locals, before any extra optimization passes.
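As a rough illustration of the tighter 16-byte layout described above, the sketch below models the frame as a C struct. This is only an analogy: the struct name, its members, and the rbp_displacement helper are hypothetical, and it assumes 8-byte pointers and 4-byte int as on x86-64 Linux. A compiler is not required to lay out a stack frame the way it lays out a struct.

```c
#include <stddef.h>

/* Hypothetical model of the compact 16-byte frame suggested above.
   Members are listed from the lowest address ([rbp-16]) upward.
   Assumes 8-byte pointers and 4-byte int (x86-64 System V). */
struct compact_frame {
    char **argv;  /* 8 bytes, 8-byte aligned */
    int argc;     /* 4 bytes, 4-byte aligned */
    int temp;     /* the 4-byte temporary */
};

/* Convert a member's offset inside the frame into a displacement
   from rbp, assuming the frame ends exactly at rbp. */
long rbp_displacement(size_t member_offset) {
    return (long)member_offset - (long)sizeof(struct compact_frame);
}
```

Under those assumptions, rbp_displacement(offsetof(struct compact_frame, argv)) is -16, the one for argc is -8, and the one for temp is -4, with no padding bytes left over.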
To see what really happens when you compile a function, look at the asm output (-S) from -O3 -march=native -fverbose-asm or something. (Do this with functions that take inputs and return a value, instead of using compile-time constant inputs, so they don't optimize away.)
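A minimal sketch of the kind of function that is worth inspecting this way: it takes its inputs as parameters and returns a result, so the compiler can't fold everything away at compile time. The function name total_arg_length is made up for this illustration and is not from the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: sum the lengths of all argument strings.
   The result depends on run-time inputs, so the loop has to
   survive optimization in some form. */
size_t total_arg_length(int argc, char *argv[]) {
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        total += strlen(argv[i]);
    }
    return total;
}
```

Assuming this is saved as a file such as len.c, something like gcc -O3 -fverbose-asm -S len.c should write the asm to len.s, where argc can be expected to be read straight from edi rather than reloaded from the stack.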
This is the start of main(), as compiled by gcc 5.3 on the Godbolt Compiler Explorer (with -O0 -fverbose-asm):
main:
push rbp #
mov rbp, rsp #,
sub rsp, 32 #,
mov DWORD PTR [rbp-20], edi # argc, argc
mov QWORD PTR [rbp-32], rsi # argv, argv
mov eax, DWORD PTR [rbp-20] # tmp92, argc # see how dumb gcc -O0 is: it reloads from memory instead of using the value in edi
...
On function entry, edi holds argc, and rsi holds argv. main()'s caller (the libc C runtime startup code) put them there. mov QWORD PTR [rbp-32], rsi is the instruction that stores argv to the bottom of the space reserved (with sub rsp, 32). [rbp-32] happens to be the same address as [rsp], but since gcc went to the trouble of making a stack frame (-fomit-frame-pointer is only the default at -O1 or higher), it addresses locals with offsets from rbp.
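To tie the rbp-relative offsets in the asm back to the addresses gdb printed in the question, here is a small worked calculation. The register values are copied from the gdb session. The macro names and the slot_address helper are hypothetical names introduced just for this sketch.

```c
#include <stdint.h>

/* Register values copied from the gdb session in the question. */
#define RBP_VALUE UINT64_C(0x7fffffffde60)
#define RSP_VALUE UINT64_C(0x7fffffffde40)

/* Hypothetical helper: address of the stack slot at [rbp - displacement]. */
uint64_t slot_address(uint64_t rbp, uint64_t displacement) {
    return rbp - displacement;
}
```

slot_address(RBP_VALUE, 32) is 0x7fffffffde40, which matches both $sp and gdb's &argv, and slot_address(RBP_VALUE, 20) is 0x7fffffffde4c, which matches &argc. The 4 bytes the question asks about, 0x7fffffffde48 through 0x7fffffffde4b, are [rbp-24] through [rbp-21]. Neither of the two stores in the prologue shown above writes to them, so at the breakpoint they still hold whatever was there before main() ran.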
In the 32bit SysV ABI, those args would already be in memory on the stack on function entry, because that ABI unfortunately doesn't use any registers for arg-passing. The extra instructions and latency imposed by those extra store-forwarding round trips required by the legacy ABI are one of the reasons 32bit is slower than 64bit, even apart from the spills/reloads caused by having fewer registers. Some 32bit Windows ABIs use 2 regs for arg-passing, e.g. the __vectorcall ABI. That's good, because a lot of Windows programs are still distributed as 32bit. (64bit Linux systems usually don't have to run any 32bit code.)
BTW, the ABI standard documents how argc/argv/envp are placed on the stack for your newly-execve(2)ed process, and that most registers other than %rsp must be assumed to contain garbage. That is the process startup environment for _start, which is significantly different from what the C runtime code sets up before calling main(). e.g. on entry to _start, the top of the stack isn't a return address, so you can't ret. (You have to make an exit(2) system call, which is what eventually happens after you return from main().)