I want to learn about the C calling convention. To do this, I wrote the following code:
#include <stdio.h>
#include <stdlib.h>

struct tstStruct
{
    void *sp;
    int k;
};

void my_func(struct tstStruct*);
typedef struct tstStruct strc;

int main()
{
    char a;
    a = 'b';
    strc* t1 = (strc*) malloc(sizeof(strc));
    t1 -> sp = &a;
    t1 -> k = 40;
    my_func(t1);
    return 0;
}

void my_func(strc* s1)
{
    void* n = s1 -> sp + 121;
    int d = s1 -> k + 323;
}
Then I used GCC with the following command:
gcc -S test3.c
and came up with its assembly. I won't show the whole code I got but rather paste the code for the function my_func. It is this:
my_func:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -24(%rbp)
movq -24(%rbp), %rax
movq (%rax), %rax
addq $121, %rax
movq %rax, -16(%rbp)
movq -24(%rbp), %rax
movl 8(%rax), %eax
addl $323, %eax
movl %eax, -4(%rbp)
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
As far as I understand, this is what happens: first the caller's base pointer is pushed onto the stack, and the stack pointer is made the new base pointer to set up the stack for the new function. But I don't understand the rest. As far as I know, the arguments (or pointers to the arguments) are stored on the stack. If so, what is the purpose of the second instruction,
movq -24(%rbp), %rax
Here, the content of the %rax register is moved to the address 24 bytes away from the address in the register %rbp. But what is in %rax? Nothing was stored there initially. I think I'm confused. Please help me understand how this function works. Thanks in advance!
You are confusing AT&T syntax with Intel syntax.
movq -24(%rbp), %rax
In Intel syntax it would be
mov rax,[rbp-24]
So it moves the data at address rbp-24 into rax, and not vice versa. The order of operands is src, dest in AT&T syntax, whereas in Intel syntax it is dest, src.
Then, to get rid of the GAS directives and make the listing easier to read, I compiled the code with gcc test3.c and disassembled the result with ndisasm -b 64 a.out. Note that the disassembly of my_func produced by NDISASM below is in Intel syntax:
000005EF  55                push rbp
000005F0  4889E5            mov rbp,rsp          ; create the stack frame.
000005F3  48897DE8          mov [rbp-0x18],rdi   ; s1 into a local variable.
000005F7  488B45E8          mov rax,[rbp-0x18]   ; rax = s1 (it's a pointer)
000005FB  488B00            mov rax,[rax]        ; dereference rax, store into rax.
000005FE  4883C079          add rax,byte +0x79   ; rax = rax + 121
00000602  488945F8          mov [rbp-0x8],rax    ; void* n = s1 -> sp + 121
00000606  488B45E8          mov rax,[rbp-0x18]   ; rax = pointer to s1
0000060A  8B4008            mov eax,[rax+0x8]    ; dereference rax+8, store into eax.
0000060D  0543010000        add eax,0x143        ; eax = eax + 323
00000612  8945F4            mov [rbp-0xc],eax    ; int d = s1 -> k + 323
00000615  5D                pop rbp
00000616  C3                ret
For information on the Linux x86-64 calling convention (System V ABI), see the answers to "What are the calling conventions for UNIX & Linux system calls on x86-64".
The function breaks down as follows (I ignore the lines that are not instructions, such as the .cfi directives):
First, the previous stack frame is saved:
pushq %rbp
movq %rsp, %rbp
Here, the old %rbp is pushed onto the stack, where it stays until the end of the function. Then %rbp is set to the current value of %rsp (which now points at the saved %rbp, 8 bytes lower than before, because of the push).
movq %rdi, -24(%rbp)
Here you first have to know one of the major differences between the i386 System V ABI and the AMD64 System V ABI.
In the i386 System V ABI, function arguments are passed on the stack (and only on the stack). In the AMD64 System V ABI, by contrast, arguments are passed in registers first (%rdi, %rsi, %rdx, %rcx, %r8 and %r9 for integers and pointers, and %xmm0 to %xmm7 for floating-point values). Once the registers are exhausted, the remaining arguments are pushed onto the stack as in i386.
So, here, the machine is just storing the first argument of the function (the pointer s1, which arrived in %rdi) temporarily on the stack.
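The register assignment described above can be sketched in C. This is only an illustration, assuming the System V AMD64 ABI; sum8, scale and check_args are hypothetical helpers that are not part of the original program, and the register names in the comments are what that ABI prescribes rather than something the code itself can verify.

```c
#include <assert.h>

/* sum8, scale and check_args are hypothetical helpers, not part of test3.c. */

/* Eight integer arguments: the first six travel in registers,
 * the last two no longer fit and are passed on the stack. */
long sum8(long a, long b, long c, long d,   /* %rdi, %rsi, %rdx, %rcx */
          long e, long f,                   /* %r8, %r9 */
          long g, long h)                   /* no integer registers left: stack */
{
    return a + b + c + d + e + f + g + h;
}

/* Floating-point and integer arguments are counted separately:
 * x is expected in %xmm0 and n in %rdi. */
double scale(double x, long n)
{
    return x * (double)n;
}

/* Small self-check showing how the helpers are called. */
void check_args(void)
{
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
    assert(scale(1.5, 4) == 6.0);
}
```

Compiling such a file with gcc -S and reading the call site is one way to see which arguments end up in registers and which are pushed.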
movq -24(%rbp), %rax
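Because the code was compiled without optimization, every variable lives in memory, so the argument is reloaded from its stack slot into %rax even though it is still in %rdi (a direct register-to-register move would have been possible). So %rax now holds the first (and only) argument of this function.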
movq (%rax), %rax
This instruction dereferences the pointer and stores the result back in %rax. Since sp is the first member of the struct, %rax now holds s1->sp.
addq $121, %rax
We add 121 to the value just loaded (s1->sp).
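Arithmetic on a void* is a GNU C extension in which the pointer advances in single bytes, which is why the addq adds exactly 121. A minimal portable sketch of the same computation casts to char* first; advance_bytes and check_advance are hypothetical names introduced here for illustration only.

```c
#include <assert.h>

/* advance_bytes is a hypothetical helper, not part of test3.c.
 * It is a portable equivalent of the expression `s1->sp + 121`. */
void *advance_bytes(void *p, long n)
{
    return (char *)p + n;   /* char * arithmetic is defined by the C standard */
}

/* Small self-check: moving 121 bytes forward inside a 200-byte buffer. */
void check_advance(void)
{
    char buf[200];
    assert((char *)advance_bytes(buf, 121) == buf + 121);
}
```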
movq %rax, -16(%rbp)
We store the result on the stack (this is the local variable n).
movq -24(%rbp), %rax
We load the first argument of the function into %rax again (remember that we stored it at -24(%rbp)).
movl 8(%rax), %eax
addl $323, %eax
As before, we dereference the pointer, this time at offset 8, which is where the member k lives, and store the value in %eax; then we add 323 to it, leaving the result in %eax.
Note that we switched from %rax to %eax because the value we are handling is no longer a void* (64 bits) as before but an int (32 bits).
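The displacement 8 in movl 8(%rax), %eax is simply the offset of k inside the struct. A small sketch, assuming a platform where int needs no more alignment than a pointer (as on x86-64), checks this with offsetof; offset_of_k and check_layout are hypothetical helpers that are not in the original program.

```c
#include <assert.h>
#include <stddef.h>

struct tstStruct {
    void *sp;   /* first member, always at offset 0 */
    int   k;    /* follows the pointer: offset 8 when pointers are 8 bytes */
};

/* Hypothetical helper: the byte offset the compiler uses to reach s1->k. */
size_t offset_of_k(void)
{
    return offsetof(struct tstStruct, k);
}

/* Small self-check of the layout assumed in the explanation. */
void check_layout(void)
{
    assert(offsetof(struct tstStruct, sp) == 0);
    assert(offset_of_k() == sizeof(void *));
}
```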
movl %eax, -4(%rbp)
Finally, we store the result in the stack slot of the local variable d. The store looks useless because d is never read afterwards; it is there only because the code was compiled without optimization, and an optimizing build (for example with -O2) would remove it.
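The values stored into n and d are never read, so nothing the function computes is observable from outside. If you want to keep experimenting with optimized builds, one option is to return the result so the compiler has to produce it. The following is only a sketch; my_func_result and check_result are hypothetical variants, not the original my_func.

```c
#include <assert.h>

struct tstStruct {
    void *sp;
    int   k;
};

/* Hypothetical variant of my_func: returning d makes the computation
 * observable, so it has to survive even in an optimized build. */
int my_func_result(const struct tstStruct *s1)
{
    int d = s1->k + 323;
    return d;           /* the return value travels back in %eax */
}

/* Small self-check using the same k = 40 as the original main. */
void check_result(void)
{
    struct tstStruct t = { 0, 40 };
    assert(my_func_result(&t) == 363);
}
```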
popq %rbp
ret
The two final instructions restore the previous stack frame and return control to the caller (here, main).
I hope this makes the behavior clearer now.