Segmentation fault when popping x86 stack

Tags: c, x86, gcc, assembly, stack-memory, nasm

I'm trying to link x86 assembly and C.

My C program:

extern int plus_10(int);

#include <stdio.h>

int main() {
    int x = plus_10(40);
    printf("%d\n", x);
    return 0;
}

My assembly program:

[bits 32]

section .text

global plus_10
plus_10:
    pop edx
    mov eax, 10
    add eax, edx
    ret

I compile and link the two as follows:

gcc -c prog.c -o prog_c.o -m32
nasm -f elf32 prog.asm -o prog_asm.o
gcc prog_c.o prog_asm.o -m32

However, when I run the resulting file, I get a segmentation fault.

But when I replace

pop edx

with

mov edx, [esp+4]

the program works fine. Can someone please explain why this happens?


asked May 13 '19 13:05

Susmit Agrawal

1 Answer

Here is one possible assembly translation of int x = plus_10(40);

        push    40                      ; push argument
        call    plus_10                 ; call function
retadd: add     esp, 4                  ; clean up stack (dummy pop)
        ; result of the function call is in EAX, per the calling convention

        ; if compiled without optimization, the caller might just store it:
        mov     DWORD PTR [ebp-x], eax  ; store return value
                                        ; (in eax) in x

When you call plus_10, the call instruction pushes the address retadd onto the stack and then jumps to the function. In other words, call is effectively push + jmp, and ret is effectively pop eip.

So your stack looks like this in the plus_10 function:

|  ...   |
+--------+
|   40   |  <- ESP+4 points here (the function argument)
+--------+
| retadd |  <- ESP points here
+--------+

ESP points to a memory location that contains the return address.

If you now use pop edx, the return address goes into edx, ESP moves up by 4, and the stack looks like this:

|  ...   |
+--------+
|   40   |  <- ESP points here
+--------+

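If you execute ret at this point, it pops 40 into EIP, so the program jumps to address 40 and will most likely segfault or misbehave in some other unpredictable way. By contrast, mov edx, [esp+4] reads the argument without moving ESP, so ret still finds the return address on top of the stack.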

The actual assembly code generated by the compiler may be different, but this illustrates the problem.
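The effect of the two variants on ESP can be sketched with a toy model in C. This is only an illustration, not real machine state: the Machine struct, the helper names, and the RETADD value are all assumptions made up for this example, and an array index stands in for ESP.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: a tiny array stands in for stack memory, and the
   struct fields stand in for the registers involved. */
enum { STACK_WORDS = 8 };
#define RETADD 0x08048123u  /* hypothetical address of the retadd label */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* one slot per 4-byte stack word */
    size_t   sp;               /* index of the top slot (models ESP) */
    uint32_t eax, edx, eip;
} Machine;

static void push(Machine *m, uint32_t v) { m->mem[--m->sp] = v; }
static uint32_t pop(Machine *m)          { return m->mem[m->sp++]; }

/* Caller side: push 40, then call plus_10 (call = push retadd + jmp). */
static Machine enter_plus_10(void) {
    Machine m = {0};
    m.sp = STACK_WORDS;   /* empty stack; it grows toward lower indices */
    push(&m, 40);
    push(&m, RETADD);
    return m;
}

/* Models: pop edx / mov eax, 10 / add eax, edx / ret */
Machine run_buggy(void) {
    Machine m = enter_plus_10();
    m.edx = pop(&m);      /* takes the return address, not the argument */
    m.eax = 10;
    m.eax += m.edx;
    m.eip = pop(&m);      /* ret pops the next word: the argument 40 */
    return m;
}

/* Models: mov edx, [esp+4] / mov eax, 10 / add eax, edx / ret */
Machine run_fixed(void) {
    Machine m = enter_plus_10();
    m.edx = m.mem[m.sp + 1]; /* [esp+4]: one word above the top, ESP unchanged */
    m.eax = 10;
    m.eax += m.edx;
    m.eip = pop(&m);         /* ret pops the real return address */
    return m;
}
```

In this model, run_buggy() ends with eip equal to 40, while run_fixed() ends with eax equal to 50 and eip equal to RETADD.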

BTW, a more efficient way to write your function is shown below. It is what most compilers would generate with optimization enabled for a non-inlined version of this tiny function.

global plus_10
plus_10:
    mov   eax,  [esp+4]    ; retval = first arg
    add   eax,  10         ; retval += 10
    ret

This is smaller and slightly more efficient than

    mov   eax,  10
    add   eax,  [esp+4]        ; decodes to a load + add
    ret
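One way to check what a compiler does here is to write the same function in C and look at the generated assembly. The sketch below is a plain C equivalent; the name plus_10_c is illustrative, chosen only so it does not clash with the assembly symbol.

```c
/* Plain C equivalent of the assembly routine. The name plus_10_c is
   illustrative, chosen so it does not clash with the assembly symbol. */
int plus_10_c(int x) {
    return x + 10;
}
```

Assuming this is saved in a file of its own, compiling it with gcc -m32 -O2 -S -masm=intel should produce something close to the mov / add / ret sequence above, although the exact output depends on the compiler version and its default options.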


answered Oct 05 '22 16:10

Jabberwocky
