Understanding C disassembled call

I want to learn about the C calling convention. To do this I wrote the following code:

#include <stdio.h>
#include <stdlib.h>

struct tstStruct
{
    void *sp;
    int k; 
};

void my_func(struct tstStruct*);

typedef struct tstStruct strc;

int main()
{
    char a;
    a = 'b';
    strc* t1 = (strc*) malloc(sizeof(strc));
    t1 -> sp = &a;
    t1 -> k = 40; 
    my_func(t1);
    return 0;   
}

void my_func(strc* s1)
{
        void* n = s1 -> sp + 121;
        int d = s1 -> k + 323;
}

Then I used GCC with the following command:

gcc -S test3.c

and got its assembly output. I won't show the whole listing, only the code for the function my_func:

my_func:
.LFB1:
.cfi_startproc
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq    %rsp, %rbp
.cfi_def_cfa_register 6
movq    %rdi, -24(%rbp)
movq    -24(%rbp), %rax
movq    (%rax), %rax
addq    $121, %rax
movq    %rax, -16(%rbp)
movq    -24(%rbp), %rax
movl    8(%rax), %eax
addl    $323, %eax
movl    %eax, -4(%rbp)
popq    %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc

As far as I understand, this is what happens: first the caller's base pointer is pushed onto the stack, and the stack pointer is made the new base pointer to set up the stack frame for the new function. But I don't understand the rest. As far as I know, the arguments (or pointers to the arguments) are stored on the stack. If so, what is the purpose of the second instruction,

movq        -24(%rbp), %rax

Here, the content of the %rax register is moved to the address 24 bytes away from the address in the register %rbp. But what is in %rax???? Nothing is initially stored there??? I think I'm confused. Please help to understand how this function works. Thanks in advance!

user2290802 asked Apr 18 '13 16:04


2 Answers

You confuse AT&T syntax with Intel syntax.

movq -24(%rbp), %rax

In Intel syntax it would be

mov rax,[rbp-24]

So it loads the 8 bytes stored at address rbp-24 into rax, not the other way around. The operand order is src, dest in AT&T syntax, whereas in Intel syntax it is dest, src.
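
To make the direction of the transfer concrete, here is a minimal C sketch that models the two possible operand orders as a load and a store. The names load_q and store_q, and the idea of treating the frame as a byte array, are illustrative assumptions; nothing here is generated by GCC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "movq off(%base), %dst" (AT&T), which is
   "mov dst,[base+off]" in Intel syntax: memory is the SOURCE and the
   returned value plays the role of the destination register. */
uint64_t load_q(const unsigned char *base, long offset)
{
    uint64_t value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Hypothetical model of the opposite direction,
   "movq %src, off(%base)" (AT&T) or "mov [base+off],src" (Intel):
   memory is the DESTINATION. */
void store_q(unsigned char *base, long offset, uint64_t value)
{
    memcpy(base + offset, &value, sizeof value);
}
```

With a 32-byte array frame and rbp = frame + 32, calling store_q(rbp, -24, v) corresponds to movq %rdi, -24(%rbp), and load_q(rbp, -24) then gives back v, which corresponds to movq -24(%rbp), %rax.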

Then, to get rid of the GAS directives and make the disassembly easier to read, I compiled the code simply with gcc test3.c and disassembled the result with ndisasm -b 64 a.out. Note that the disassembly of my_func produced by NDISASM below is in Intel syntax:

000005EF  55                push rbp
000005F0  4889E5            mov rbp,rsp        ; create the stack frame.
000005F3  48897DE8          mov [rbp-0x18],rdi ; s1 into a local variable.
000005F7  488B45E8          mov rax,[rbp-0x18] ; rax = s1 (it's a pointer)
000005FB  488B00            mov rax,[rax]      ; dereference rax, store into rax.
000005FE  4883C079          add rax,byte +0x79 ; rax = rax + 121
00000602  488945F8          mov [rbp-0x8],rax  ; void* n = s1 -> sp + 121
00000606  488B45E8          mov rax,[rbp-0x18] ; rax = s1 again (reload the pointer)
0000060A  8B4008            mov eax,[rax+0x8]  ; dereference rax+8, store into eax.
0000060D  0543010000        add eax,0x143      ; eax = eax + 323
00000612  8945F4            mov [rbp-0xc],eax  ; int d = s1 -> k + 323
00000615  5D                pop rbp
00000616  C3                ret

For information on the Linux x86-64 calling convention (System V ABI), see the answers to "What are the calling conventions for UNIX & Linux system calls on x86-64".

nrz answered Sep 23 '22 18:09



The function breaks down as follows (I skip the assembler directives):

First, the previous stack frame is saved:

pushq   %rbp
movq    %rsp, %rbp

Here, the old %rbp is pushed onto the stack, where it stays until the end of the function. Then %rbp is set to the new value of %rsp (which now points at the saved %rbp, since the push moved it down by 8 bytes).

movq    %rdi, -24(%rbp)

Here you first have to know one of the major differences between the i386 System V ABI and the amd64 System V ABI.

In the i386 System V ABI, function arguments are passed on the stack (and only on the stack). In the amd64 System V ABI, by contrast, arguments are passed in registers first (%rdi, %rsi, %rdx, %rcx, %r8 and %r9 for integers and pointers, %xmm0 to %xmm7 for floating-point values). Once the registers are exhausted, the remaining arguments are passed on the stack, as in i386.

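So, here, the function is just storing its first argument (the pointer s1, which the ABI treats as an integer-class value and passes in %rdi) temporarily on the stack. The slot at -24(%rbp) lies below %rsp, in the 128-byte red zone that the amd64 ABI lets a leaf function use without adjusting %rsp.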
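
As an illustration of the register-then-stack rule, here is a small sketch with eight integer arguments. The function name sum8 is made up for this example; the register assignment in the comment is the one given by the System V AMD64 ABI.

```c
/* Eight integer arguments. Under the System V AMD64 ABI the first six
   (a to f) arrive in %rdi, %rsi, %rdx, %rcx, %r8 and %r9, in that
   order; g and h do not fit in registers and are passed on the stack
   by the caller. */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h)
{
    return a + b + c + d + e + f + g + h;
}
```

Calling sum8(1, 2, 3, 4, 5, 6, 7, 8) returns 36. Compiling a file containing this function with gcc -S and reading the output should show the first six values being taken from registers and the last two from positive offsets off %rbp.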

movq    -24(%rbp), %rax

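The argument is then loaded back from its stack slot into %rax. A direct register-to-register move (movq %rdi, %rax) would be perfectly legal; the round trip through memory is there only because the code was compiled without optimization, in which case GCC keeps every variable, parameters included, in a stack slot and reloads it for each use. So %rax now holds the first (and only) argument of this function, the pointer s1.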

movq    (%rax), %rax

This instruction dereferences the pointer: it loads the 8 bytes at offset 0 of the structure, which is the field sp, and stores the result back in %rax.

addq    $121, %rax

We add 121 to the loaded value, that is, to s1->sp.

movq    %rax, -16(%rbp)

We store the result on the stack, in the slot of the local variable n.

movq    -24(%rbp), %rax

We load the first argument of the function into %rax again (remember that we stored it at -24(%rbp)).

movl    8(%rax), %eax
addl    $323, %eax

As before, we load a value through the pointer, this time the int field k at offset 8 of the structure, into %eax; then we add 323 to it, leaving the result in %eax.

Note that we switched from %rax to %eax because the value we are handling is no longer a void* (64 bits) but an int (32 bits).
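
The offsets 0 and 8 used in (%rax) and 8(%rax) come straight from the layout of the structure. A minimal sketch to check them with offsetof follows; the two helper functions are hypothetical names introduced for this example.

```c
#include <stddef.h>

/* Same layout as the struct in the question. */
struct tstStruct {
    void *sp;
    int k;
};

/* Hypothetical helpers reporting where each field lives. */
size_t offset_of_sp(void) { return offsetof(struct tstStruct, sp); }
size_t offset_of_k(void)  { return offsetof(struct tstStruct, k); }
```

On x86-64 Linux, where a pointer is 8 bytes, offset_of_sp() is 0 and offset_of_k() is 8, which matches the addressing modes in the listing.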

movl    %eax, -4(%rbp)

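Finally, we store the result in the stack slot of the local variable d. The store is useless, since d is never read afterwards, but without optimization (the default, -O0) GCC does not eliminate dead stores; with -O2 the whole body of the function would most likely reduce to a single ret.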
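
If you want to study optimized output without the computation disappearing, the results have to be observable by the caller. Here is one possible sketch; my_func_result and its n_out parameter are hypothetical, and the pointer arithmetic is done on uintptr_t instead of void* (arithmetic on void* is a GNU extension).

```c
#include <stdint.h>

struct tstStruct {
    void *sp;
    int k;
};

/* Hypothetical variant of my_func: both results leave the function
   (one through n_out, one as the return value), so the compiler has
   to keep the two loads and the two additions at any -O level. */
int my_func_result(const struct tstStruct *s1, uintptr_t *n_out)
{
    *n_out = (uintptr_t)s1->sp + 121; /* address value of sp plus 121 */
    return s1->k + 323;               /* int field plus 323 */
}
```

With k set to 40 as in the question, the function returns 363.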

popq    %rbp
ret

The two final instructions restore the caller's frame pointer and return control to main.

I hope this makes the behavior clearer now.

perror answered Sep 24 '22 18:09