
x86 calling convention: should arguments passed by stack be read-only?

It seems that state-of-the-art compilers treat arguments passed on the stack as read-only. Note that in the x86 calling convention, the caller pushes arguments onto the stack and the callee reads them from the stack. For example, the following C code:

extern int goo(int *x);
int foo(int x, int y) {
  goo(&x);
  return x;
}

is compiled by clang -O3 -c g.c -S -m32 on OS X 10.10 into:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl  _foo
    .align  4, 0x90
_foo:                                   ## @foo
## BB#0:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp
    movl    8(%ebp), %eax
    movl    %eax, -4(%ebp)
    leal    -4(%ebp), %eax
    movl    %eax, (%esp)
    calll   _goo
    movl    -4(%ebp), %eax
    addl    $8, %esp
    popl    %ebp
    retl


.subsections_via_symbols

Here, the parameter x (at 8(%ebp)) is first loaded into %eax and then stored at -4(%ebp). The address of -4(%ebp) is then loaded into %eax, and %eax is stored at (%esp) as the argument to the function goo.

I wonder why Clang generates code that copies the value stored at 8(%ebp) to -4(%ebp), rather than just passing the address of 8(%ebp) to the function goo. That would save memory operations and perform better. I observed similar behaviour in GCC too (under OS X). To be more specific, I wonder why compilers do not generate:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl  _foo
    .align  4, 0x90
_foo:                                   ## @foo
## BB#0:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp
    leal    8(%ebp), %eax
    movl    %eax, (%esp)
    calll   _goo
    movl    8(%ebp), %eax
    addl    $8, %esp
    popl    %ebp
    retl


.subsections_via_symbols

I searched for documentation on whether the x86 calling convention requires passed arguments to be read-only, but I couldn't find anything on the issue. Does anybody have any thoughts on this?

Jeehoon Kang asked May 17 '15 08:05


1 Answer

The rules for C are that parameters must be passed by value. A compiler converts from one language (with one set of rules) to a different language (potentially with a completely different set of rules). The only limitation is that the behaviour remains the same. The rules of the C language do not apply to the target language (e.g. assembly).

What this means is that if a compiler feels like generating assembly language where parameters are passed by reference rather than by value, that is perfectly legal (as long as the behaviour remains the same).
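To make "the behaviour remains the same" concrete, here is a minimal sketch of the behaviour the generated code has to preserve for the foo from the question. The body of goo is an assumption made for illustration (the question only declares it extern), and check_by_value is a hypothetical helper that is not part of the original code.

```c
#include <assert.h>

/* Hypothetical body for goo(); the question only declares it extern.
   Assumed here to write through the pointer it receives. */
int goo(int *x) {
    *x += 1;
    return 0;
}

/* Same shape as foo() in the question: x is foo's own copy of the argument. */
int foo(int x, int y) {
    (void)y;      /* unused, as in the question */
    goo(&x);      /* goo may modify foo's copy of x */
    return x;
}

/* Hypothetical helper: the observable behaviour any generated code must keep. */
void check_by_value(void) {
    int v = 41;
    int r = foo(v, 0);
    assert(r == 42);  /* foo sees goo's write to its own copy */
    assert(v == 41);  /* the caller's variable is untouched */
}
```

Whether foo's copy of x lives in the incoming argument slot or in a fresh local slot is invisible at this level; both choices satisfy these two checks.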

The real limitation has nothing to do with C at all. The real limitation is linking. So that different object files can be linked together, standards are needed to ensure that whatever the caller in one object file expects matches whatever the callee in another object file provides. This is what's known as the ABI. In some cases (e.g. 64-bit 80x86) there are multiple different ABIs for the exact same architecture.

You can even invent your own ABI that's radically different (and implement your own tools that support your own radically different ABI) and that's perfectly legal as far as the C standards go; even if your ABI requires "pass by reference" for everything (as long as the behaviour remains the same).
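As a rough illustration of such a "pass by reference" ABI, the sketch below hand-writes in C what a compiler could hypothetically emit. All names here (bump, bump_lowered, call_bump_lowered, check_same_behaviour) are invented for the example, and having the caller make a temporary copy is just one assumed way to keep the by-value behaviour intact.

```c
#include <assert.h>

/* Source-level function with ordinary C by-value semantics. */
int bump(int x) {
    x += 1;
    return x;
}

/* Hypothetical lowering for an imagined ABI that passes everything by
   reference: the callee receives the address of a slot it may overwrite. */
int bump_lowered(int *x_slot) {
    *x_slot += 1;
    return *x_slot;
}

/* What a call site could look like under that imagined ABI: the caller
   copies the value into a temporary so its own variable is never modified. */
int call_bump_lowered(int v) {
    int tmp = v;
    return bump_lowered(&tmp);
}

/* Hypothetical helper: both versions behave identically from the outside. */
void check_same_behaviour(void) {
    int v = 5;
    assert(bump(v) == call_bump_lowered(v));
    assert(v == 5);  /* the caller's variable is unchanged either way */
}
```

The copy has to happen somewhere, in the caller or in the callee, whenever the callee might write to the parameter; the ABI only fixes which side is responsible for it.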

Brendan answered Oct 06 '22 23:10