
What is the space between argv and argc on my stack?

I have a very simple C program that I'm using with GDB to learn more about the stack:

#include<stdlib.h>
#include<stdio.h>

int main(int argc, char* argv[]){
  printf("argc is %d", argc);
  int i = 0;
  for(; i<argc; i++){
    printf("argv at %d is %s", i, argv[i]);
  }
  return 0;
}

I compile this program using gcc foo.c -g and then run it in gdb using gdb ./a.out. Inside gdb, I set a breakpoint at main using b main, and then show the stack pointer and base pointer:

Reading symbols from ./a.out...done.
(gdb) b main
Breakpoint 1 at 0x40053c: file foo.c, line 5.
(gdb) r
Starting program: /tmp/a.out 

Breakpoint 1, main (argc=1, argv=0x7fffffffdf48) at foo.c:5
5     printf("argc is %d", argc);
(gdb) p $sp
$1 = (void *) 0x7fffffffde40

(gdb) p $rbp
$2 = (void *) 0x7fffffffde60

(gdb) x/8x $sp
0x7fffffffde40: 0xffffdf48  0x00007fff  0x00400440  0x00000001
0x7fffffffde50: 0xffffdf40  0x00007fff  0x00000000  0x00000000

(gdb) p &argv
$3 = (char ***) 0x7fffffffde40
(gdb) p &argc
$4 = (int *) 0x7fffffffde4c

So I can see here that argv is stored at the same address as $sp, the top of the stack, 0x7fffffffde40. I also see that argc's address is shortly after it, at 0x7fffffffde4c.

However, I'm not sure what the data at 0x7fffffffde48 through 0x7fffffffde4b is holding. Is it anything important, or just garbage? Why isn't argv directly adjacent to argc on the stack?

Thanks!

Joseph asked Feb 07 '23 20:02



1 Answer

In the x86-64 System V ABI, function args are passed in registers. (For links to other ABI docs, and explanations of what an ABI is, see the x86 tag wiki.)

They only have addresses at all because gcc -O0 spills them to the stack. This makes debugging C/C++ easier/more consistent: everything has an address, and the value stored there is always up-to-date after every C statement. However, it makes terribly inefficient asm code. gcc -Og isn't as strict about storing to memory all the time, so you sometimes get "value optimized out", but it's still "optimized for debugging".

The other goal of gcc -O0 is to compile fast, not to make good code. So don't be surprised that it makes non-optimal decisions about laying out locals on the stack. e.g. it could have only reserved 16 bytes, and put argv at [rbp-16] (8 byte aligned), argc at [rbp-8] (4 byte aligned), and kept the 4B temporary at [rbp-4] like gcc5.3's actual choice.

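The only "reason" for the gap between their actual storage locations is the inner workings of gcc's algorithms for laying out locals, before any extra optimization passes. Nothing in this function stores to those 4 bytes, so they are unused padding that holds whatever was left on the stack earlier; they are not important.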
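One way to see that the gap is only a compiler layout decision is to measure it from C. The sketch below uses a hypothetical helper, param_gap (not part of the question's program), that takes the same parameter types as main and reports how far apart the spilled copies are. Taking the address of a parameter forces it into memory, which mimics what -O0 does for every variable. The result depends on the compiler, its version, and the optimization level, so no particular value is guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte distance from the stack slot holding argv
 * to the stack slot holding argc. Taking the addresses forces both
 * register-passed parameters to be stored in memory. */
ptrdiff_t param_gap(int argc, char **argv)
{
    intptr_t argc_slot = (intptr_t)&argc;
    intptr_t argv_slot = (intptr_t)&argv;
    return (ptrdiff_t)(argc_slot - argv_slot);
}
```

In the question's gdb session the equivalent difference is 0x7fffffffde4c - 0x7fffffffde40 = 12 bytes: 8 bytes for argv plus the 4 unused bytes. A different compiler or optimization level may well give a different number.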


To see what really happens when you compile a function, look at asm output (-S) from -O3 -march=native -fverbose-asm or something. (Do this with functions that take inputs and return a value, instead of compile-time constant inputs, so they don't optimize away.)
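As a rough illustration of the kind of function that is worth inspecting this way, here is a minimal sketch. The name scale_and_add is made up for this example. Because both inputs are only known at run time, the compiler cannot fold the result into a constant, so the generated asm shows real work.

```c
/* Hypothetical example function for inspecting compiler output.
 * In the x86-64 System V ABI, x arrives in edi and y in esi, and the
 * result is returned in eax. */
int scale_and_add(int x, int y)
{
    return x * 3 + y;
}
```

Compiling a file containing only this function with gcc -O3 -S -fverbose-asm should give a short listing that is easy to read next to the -O0 version.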

This is the start of main(), as compiled by gcc 5.3 on the Godbolt Compiler Explorer (with -O0 -fverbose-asm):

main:
    push    rbp     #
    mov     rbp, rsp  #,
    sub     rsp, 32   #,
    mov     DWORD PTR [rbp-20], edi   # argc, argc
    mov     QWORD PTR [rbp-32], rsi   # argv, argv
    mov     eax, DWORD PTR [rbp-20]   # tmp92, argc     # see how dumb gcc -O0 is: it reloads from memory instead of using the value in edi
    ...

On function entry, edi holds argc, and rsi holds argv. main()'s caller (the libc C runtime startup code) put them there. mov QWORD PTR [rbp-32], rsi is the instruction that stores argv to the bottom of the space reserved (with sub rsp, 32). [rbp-32] happens to be the same address as [rsp], but since gcc went to the trouble of making a stack frame (-fomit-frame-pointer is only the default at -O1 or higher), it addresses locals with offsets from rbp.
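The offsets in the asm can be checked against the addresses gdb printed in the question. The sketch below just does that arithmetic; slot_address and padding_between are hypothetical helper names, and the numbers in the usage note are the ones from the question's session ($rbp = 0x7fffffffde60).

```c
#include <stdint.h>

/* Address of the stack slot [rbp - offset]. */
uint64_t slot_address(uint64_t rbp, uint64_t offset)
{
    return rbp - offset;
}

/* Unused bytes between the end of a lower slot (low_addr, low_size)
 * and the start of the next slot above it (high_addr). */
uint64_t padding_between(uint64_t low_addr, uint64_t low_size,
                         uint64_t high_addr)
{
    return high_addr - (low_addr + low_size);
}
```

With the question's values, slot_address(0x7fffffffde60, 20) is 0x7fffffffde4c (the address gdb shows for &argc) and slot_address(0x7fffffffde60, 32) is 0x7fffffffde40 (&argv, which is also $sp). padding_between(0x7fffffffde40, 8, 0x7fffffffde4c) is 4, which is exactly the range 0x7fffffffde48 through 0x7fffffffde4b the question asks about.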


In the 32bit SysV ABI, those args would already be in memory on the stack on function entry, because that ABI unfortunately doesn't use any registers for arg-passing. The extra instructions and latency imposed by those extra store-forwarding round trips required by the legacy ABI are one of the reasons 32bit is slower than 64bit, even apart from the spills/reloads caused by having fewer registers. Some 32bit Windows ABIs use 2 regs for arg-passing, e.g. the __vectorcall ABI. That's good, because a lot of Windows programs are still distributed as 32bit. (64bit Linux systems usually don't have to run any 32bit code.)


BTW, the ABI standard documents how argc/argv/envp are placed on the stack for your newly-execve(2)ed process, and that most registers other than %rsp must be assumed to contain garbage. That is the process startup environment for _start, which is significantly different from what the C runtime code sets up before calling main(). For example, on entry to _start, the top of the stack isn't a return address, so you can't ret. (You have to make an exit(2) system call, which is what eventually happens after you return from main().)
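In that initial stack layout, the argv pointers are followed by a NULL terminator and then by the envp pointers, which end with their own NULL. A minimal sketch of what that implies, assuming argv still points into that original array (typically the case on Linux, but an assumption here); envp_after_argv and count_strings are hypothetical names:

```c
#include <stddef.h>

/* Assumes the System V initial-stack layout:
 *   argv[0] ... argv[argc-1], NULL, envp[0] ... envp[n-1], NULL
 * so the environment pointers start one slot past argv's terminator. */
char **envp_after_argv(int argc, char **argv)
{
    return argv + argc + 1;
}

/* Counts the entries in a NULL-terminated array of strings. */
size_t count_strings(char **vec)
{
    size_t n = 0;
    while (vec[n] != NULL)
        n++;
    return n;
}
```

Calling envp_after_argv(argc, argv) from main and comparing the result with the third parameter of int main(int argc, char **argv, char **envp) is one way to check whether the assumption holds on a given system.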

See the x86 tag wiki for lots more links to docs, tutorials, and beginner questions.


Peter Cordes answered Feb 16 '23 02:02
