
In assembly code, how do the .cfi directives work?

[assembly code]

main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    movl    $5, 20(%esp)
    movl    $3, 24(%esp)
    movl    24(%esp), %eax
    movl    %eax, 4(%esp)
    movl    20(%esp), %eax
    movl    %eax, (%esp)
    call    add
    movl    %eax, 28(%esp)
    movl    $0, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .globl  add
    .type   add, @function
add:
.LFB1:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    subl    $16, %esp
    movl    12(%ebp), %eax
    movl    8(%ebp), %edx
    addl    %edx, %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc

[source code]

int add(int k, int l);

int main(int argc, char **argv) { 
        int a, b, ret;
        a = 5;
        b = 3;
        ret = add(a, b); 
        return 0;
}

int add(int k, int l) { 
        int x;
        x = k + l;
        return x;
}

I'm studying the calling convention of C functions at the assembly-language level.

As I understand it, .cfi directives are used to add debug information. I've read some articles on CFI and know the meaning of each directive.

In the assembly code above, the .cfi_def_cfa_offset 8 and .cfi_offset 5, -8 directives appear consecutively. This happens in both the 'main' function and the 'add' function.

But I don't know why this happens. My understanding is that .cfi_def_cfa_offset and .cfi_offset reserve memory to store debug information. In this code, the offset is first set to +8 and then to -8, so no memory remains to store the CFI. Am I right?

I think the stack segment works like this:

.cfi_startproc
|-------------|
|  whatever   | <- %esp = CFA      ↑ increase address
|-------------|
|             |                    ↓ stack grow
|_____________|



pushl  %ebp
|-------------|
|  whatever   | 
|-------------|
|   %ebp      | <- %esp
|_____________|


.cfi_def_cfa_offset 8
|-------------|
|  whatever   |  <- %esp
|-------------|
|   whatever  |
|-------------|
|   %ebp      |
|-------------|



.cfi_offset 5 -8 
|-------------|
|  whatever   |  
|-------------|
|   whatever  |
|-------------|
|   %ebp      | <- %esp
|-------------|



 subl $32, %esp
|-------------|
|   whatever  |
|-------------|
|    %ebp     |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             | <- %esp
|-------------|



 movl $5, 20(%esp)
|-------------|
|   whatever  |
|-------------|
|    %ebp     |
|-------------|
|             |
|-------------|
|             |
|-------------|
|      5      |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             |
|-------------|
|             | <- %esp
|-------------|

and so on...

Question 2.

In the procedure add, the parameters passed by the caller are loaded into registers:

    movl    12(%ebp), %eax
    movl    8(%ebp), %edx

But by my calculation, 8(%ebp) does not point to the top of the caller's stack, because:

1) at pushl %ebp, 4 is subtracted from %esp

2) at .cfi_offset 5, -8, 8 is subtracted from %esp (here I ignore .cfi_def_cfa_offset 8; I'm not sure about this)

So the top of the caller's stack should be at 12(%ebp), and 8(%ebp) should point to the caller's stored base pointer.

I can't tell where my reasoning goes wrong. I need your help.

-added

What do the CFI directives mean? (and some more questions)

That SO question is almost the same as mine, but nobody has answered it clearly.

casamia asked Dec 25 '22 21:12


1 Answer

The .cfi directives do not generate any assembly code. They are not executed and do not change the layout of your call frame at all.

Instead, they tell the tools that need to unwind the stack (the debugger, the exception unwinder) about the structure of the frame and how to unwind it. This information is not stored alongside the instructions but in another section of the program (see Note 1).

Let's look at this snippet:

main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    movl    $5, 20(%esp)
    movl    $3, 24(%esp)
    movl    24(%esp), %eax
    movl    %eax, 4(%esp)
    movl    20(%esp), %eax
    movl    %eax, (%esp)
    call    add
    movl    %eax, 28(%esp)
    movl    $0, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc

The assembler assembles the instructions into the .text section and encodes the .cfi directives into another section (.eh_frame or .debug_frame):

$ gcc -m32 -g test.s -c -o a.out
$ objdump -d a.out
[...]
00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 e4 f0                and    $0xfffffff0,%esp
   6:   83 ec 20                sub    $0x20,%esp
   9:   c7 44 24 14 05 00 00    movl   $0x5,0x14(%esp)
  10:   00 
  11:   c7 44 24 18 03 00 00    movl   $0x3,0x18(%esp)
  18:   00 
  19:   8b 44 24 18             mov    0x18(%esp),%eax
  1d:   89 44 24 04             mov    %eax,0x4(%esp)
  21:   8b 44 24 14             mov    0x14(%esp),%eax
  25:   89 04 24                mov    %eax,(%esp)
  28:   e8 fc ff ff ff          call   29 <main+0x29>
  2d:   89 44 24 1c             mov    %eax,0x1c(%esp)
  31:   b8 00 00 00 00          mov    $0x0,%eax
  36:   c9                      leave  
  37:   c3                      ret

Notice that only the instructions are present in the code of the main function. The CFI is stored elsewhere:

$ readelf -wF a.out 
Contents of the .eh_frame section:

00000000 00000014 00000000 CIE "zR" cf=1 df=-4 ra=8
   LOC   CFA      ra      
00000000 esp+4    c-4   

00000018 0000001c 0000001c FDE cie=00000000 pc=00000000..00000038
   LOC   CFA      ebp   ra      
00000000 esp+4    u     c-4   
00000001 esp+8    c-8   c-4   
00000003 ebp+8    c-8   c-4   
00000037 esp+4    u     c-4

The CFI is information (not native CPU instructions) describing the layout of the frame.
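One way to picture how an unwinder uses the table above: for a given code offset, it picks the last row whose LOC is not greater than that offset. The C sketch below transcribes the four rows of the FDE for main and performs that lookup. The struct, enum, and function names are hypothetical and chosen for illustration; a real unwinder decodes the binary .eh_frame encoding rather than using a hand-written array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register the CFA is computed from (illustrative names). */
enum base_reg { BASE_ESP, BASE_EBP };

/* One row of the table printed by `readelf -wF` (hypothetical layout). */
struct cfi_row {
    uint32_t loc;          /* first code offset where this row applies */
    enum base_reg cfa_reg; /* CFA = cfa_reg + cfa_off */
    int32_t cfa_off;
    int ebp_saved;         /* 0 = "u" (not saved), 1 = saved at CFA + ebp_off */
    int32_t ebp_off;
    int32_t ra_off;        /* return address is stored at CFA + ra_off */
};

/* Rows transcribed from the FDE for main shown above. */
static const struct cfi_row main_rows[] = {
    { 0x00, BASE_ESP, 4, 0,  0, -4 },
    { 0x01, BASE_ESP, 8, 1, -8, -4 },
    { 0x03, BASE_EBP, 8, 1, -8, -4 },
    { 0x37, BASE_ESP, 4, 0,  0, -4 },
};

/* Return the row in effect at code offset pc: the last row with loc <= pc. */
const struct cfi_row *find_row(const struct cfi_row *rows, size_t n, uint32_t pc)
{
    assert(n > 0 && rows[0].loc <= pc);
    const struct cfi_row *hit = &rows[0];
    for (size_t i = 1; i < n && rows[i].loc <= pc; i++)
        hit = &rows[i];
    return hit;
}

/* Convenience wrapper for the table of main. */
const struct cfi_row *find_main_row(uint32_t pc)
{
    return find_row(main_rows, sizeof main_rows / sizeof main_rows[0], pc);
}
```

For example, find_main_row(0x28) (the offset of the call instruction) selects the row starting at 0x03, so the CFA there is computed as %ebp + 8 and is unaffected by the later changes to %esp.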

Example

For example let's take this snippet:

.cfi_startproc
pushl   %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8

.cfi_startproc

.cfi_startproc initializes the CFI for the function. At this point, the CFA (Canonical Frame Address, which is the value of %esp in the caller's frame just before the call instruction) is given by %esp + 4 (because the call instruction pushed the return address):

whatever              <- CFA
return address  (ra)  <- %esp

This directive is "compiled" into the .eh_frame section:

   LOC   CFA      ebp   ra      
00000000 esp+4    u     c-4

.cfi_def_cfa_offset and .cfi_offset

After the pushl %ebp instruction, this no longer holds: cfa ≠ %esp + 4 because %esp has changed. After this instruction, we have cfa = %esp + 8. The debugger needs to know this, and the .cfi_def_cfa_offset 8 directive generates the corresponding information in the .eh_frame section: it sets the offset to 8 in cfa = %esp + offset.

whatever             <- CFA  = %esp + 8
return address (ra)
caller %ebp          <- %esp (= CFA - 8)

The purpose of pushl %ebp was to save the caller's value of %ebp on the stack. The debugger needs to know where this value was saved in order to unwind the stack and restore the caller's frame. The .cfi_offset 5, -8 directive tells the debugger that register 5 (%ebp) was saved by the previous instruction at cfa - 8.

This information is found in the next entry of the .eh_frame table:

   LOC   CFA      ebp   ra
[...]      
00000001 esp+8    c-8   c-4
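To make the arithmetic concrete, here is a small C sketch that models a 32-bit stack as an array, replays the call and the pushl %ebp, and then applies the row "CFA = esp+8, ebp at c-8, ra at c-4" to recover the caller's state. It is a toy model under stated assumptions: the struct machine type, the helper names, and the 16-word stack are all hypothetical, and a real debugger reads target memory instead of an array.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine: stack[i] models the 32-bit word at byte address 4*i. */
struct machine {
    uint32_t stack[16];
    uint32_t esp;   /* byte address into the toy stack */
    uint32_t ebp;
};

/* Model of a 32-bit push: decrement esp by 4, then store. */
static void push32(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4 && m->esp % 4 == 0 && m->esp / 4 <= 16);
    m->esp -= 4;
    m->stack[m->esp / 4] = value;
}

/* Read the word stored at a byte address of the toy stack. */
static uint32_t load32(const struct machine *m, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < 16);
    return m->stack[addr / 4];
}

/* What the unwinder recovers about the caller. */
struct caller_state { uint32_t esp, ebp, ret_addr; };

/* Apply the row valid right after pushl %ebp:
   CFA = esp + 8, ebp saved at CFA - 8, return address at CFA - 4. */
struct caller_state unwind_after_push(const struct machine *m)
{
    uint32_t cfa = m->esp + 8;
    struct caller_state c;
    c.esp = cfa;                      /* caller's %esp just before the call */
    c.ebp = load32(m, cfa - 8);       /* .cfi_offset 5, -8 */
    c.ret_addr = load32(m, cfa - 4);  /* ra column: c-4 */
    return c;
}

/* Replay `call` followed by `pushl %ebp`, then unwind. */
struct caller_state demo_unwind(uint32_t caller_esp, uint32_t caller_ebp,
                                uint32_t ret_addr)
{
    struct machine m = { {0}, caller_esp, caller_ebp };
    push32(&m, ret_addr);  /* done implicitly by the call instruction */
    push32(&m, m.ebp);     /* pushl %ebp */
    return unwind_after_push(&m);
}
```

Calling demo_unwind(48, 0x1234, 0x8048400) returns the same three values it was given, which is the point: the directives only describe where the existing values sit relative to the CFA, and nothing on the stack is reserved for them.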

Notes

Note 1: In some cases this information is part of the debugging information, which means that it might not be present at runtime, and might not be present in the file at all if the file was not compiled with debugging information.

ysdx answered Jan 30 '23 00:01