
Passing parameters one by one, or by wrapping them in an array, struct or tuple

When passing arguments to a function, I always assumed that passing them one by one is no different from passing them wrapped in an array, a struct or a tuple. However, a simple experiment showed that I was wrong.

The following program, when compiled with GCC:

#include <array>
#include <tuple>

int test(int a, int b, int c, int d) {
    return a + b + c + d;
}

int test(std::array<int, 4> arr) {
    return arr[0] + arr[1] + arr[2] + arr[3];
}

struct abcd {
    int a; int b; int c; int d;
};

int test(abcd s) {
    return s.a + s.b + s.c + s.d;
}

int test(std::tuple<int, int, int, int> tup) {
    return std::get<0>(tup) + std::get<1>(tup) + std::get<2>(tup) + std::get<3>(tup);
}

...produces a variety of assembly outputs:

impl_test(int, int, int, int):
    lea eax, [rdi+rsi]
    add eax, edx
    add eax, ecx
    ret

impl_test(std::array<int, 4ul>):
    mov rax, rdi
    sar rax, 32
    add eax, edi
    add eax, esi
    sar rsi, 32
    add eax, esi
    ret

impl_test(abcd):
    mov rax, rdi
    sar rax, 32
    add eax, edi
    add eax, esi
    sar rsi, 32
    add eax, esi
    ret

impl_test(std::tuple<int, int, int, int>):
    mov eax, DWORD PTR [rdi+8]
    add eax, DWORD PTR [rdi+12]
    add eax, DWORD PTR [rdi+4]
    add eax, DWORD PTR [rdi]
    ret

main:
    push    rbp
    push    rbx
    mov ecx, 4
    mov edx, 3
    movabs  rbp, 8589934592
    mov esi, 2
    sub rsp, 24
    mov edi, 1
    movabs  rbx, 17179869184
    call    int test<int, int, int, int>(int, int, int, int)

    mov rdi, rbp
    mov rsi, rbx
    or  rbx, 3
    or  rdi, 1
    or  rsi, 3
    call    int test<std::array<int, 4ul> >(std::array<int, 4ul>)

    mov rdi, rbp
    mov rsi, rbx
    or  rdi, 1
    call    int test<abcd>(abcd)

    mov rdi, rsp
    mov DWORD PTR [rsp], 4
    mov DWORD PTR [rsp+4], 3
    mov DWORD PTR [rsp+8], 2
    mov DWORD PTR [rsp+12], 1
    call    int test<std::tuple<int, int, int, int> >(std::tuple<int, int, int, int>)

    add rsp, 24
    xor eax, eax
    pop rbx
    pop rbp
    ret

Why is there a difference?

StackedCrooked asked May 08 '15 01:05



1 Answer

When a function is actually called (that is, not inlined, constexpr-evaluated or eliminated), the way its arguments are passed depends on many factors, including:

  • For an argument of a primitive type, whether it is an integer or a floating-point value.
  • The type of the argument.
  • Whether its address is taken in some non-eliminated code in the callee.
  • The default or specified calling convention.
  • Whether Whole Program Optimization (WPO) is being used.
  • Whether the callee is in a shared library, static library or object file, or in the same translation unit.
  • The specified floating-point behavior.
  • The target platform.
  • The position of the parameter in the parameter list.

Let's get back to the example you provided. You compiled the code with -O2. The listing shows that the calls in main were neither inlined nor eliminated, so every function is actually called. The target platform is x64.

The first function has four 4-byte integer parameters, so all of them are passed in registers (edi, esi, edx and ecx in the listing).

The second function has one fixed-size array of four 4-byte integers. The compiler decided to use two registers (rdi and rsi) to pass the four integers where rdi = 0x200000001 and rsi = 0x400000003. Notice how the four integers (1, 2, 3, 4) are compactly passed using these two registers.
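The packing can be reproduced by hand. The following is only a sketch: the helper names pack, low_half, high_half and sum_packed are made up for illustration, and the parameter names rdi and rsi merely mirror the registers in the listing. It assumes 32-bit int and a little-endian layout, where the lower-indexed element lands in the low half of the register.

```cpp
#include <cstdint>

// Hypothetical helper: pack two 32-bit ints into one 64-bit value,
// lower-indexed element in the low 32 bits (little-endian layout assumed).
constexpr std::uint64_t pack(std::int32_t lo, std::int32_t hi) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(lo)) |
           (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi)) << 32);
}

// Hypothetical helpers: recover the two halves again. Taking the high half
// corresponds to the "sar reg, 32" instructions in the callee.
constexpr std::int32_t low_half(std::uint64_t r) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(r));
}
constexpr std::int32_t high_half(std::uint64_t r) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(r >> 32));
}

// Sum the four packed ints, roughly what the callee's assembly computes.
constexpr int sum_packed(std::uint64_t rdi, std::uint64_t rsi) {
    return low_half(rdi) + high_half(rdi) + low_half(rsi) + high_half(rsi);
}

// The values match the constants mentioned above.
static_assert(pack(1, 2) == 0x200000001ULL, "rdi holds elements 0 and 1");
static_assert(pack(3, 4) == 0x400000003ULL, "rsi holds elements 2 and 3");
```

This also explains the movabs constants in main: 8589934592 is 0x200000000 and 17179869184 is 0x400000000, and the low halves are filled in with or.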

Passing the integers as a structure rather than one by one made the compiler use a different technique to pass them. There is a trade-off here between code size, speed and the number of registers required.

The same thing goes for the third function.
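One way to see why the array and the struct end up with identical code is to compare their sizes. This is a rough sketch, not the full rule: the System V x86-64 ABI allows small aggregates of integer fields, up to 16 bytes, to travel in up to two general-purpose registers, but the real classification has more cases (floating-point fields, for instance). The name fits_in_two_gprs is hypothetical, and the check assumes a 4-byte int and an std::array without padding, which holds on common implementations.

```cpp
#include <array>
#include <cstddef>

struct abcd {
    int a; int b; int c; int d;
};

// Hypothetical, simplified check: does the type fit in two 8-byte registers?
// It ignores the other classification rules of the ABI on purpose.
template <typename T>
constexpr bool fits_in_two_gprs() {
    return sizeof(T) <= 2 * sizeof(std::size_t) && sizeof(std::size_t) == 8;
}

// Both types are expected to occupy exactly four ints' worth of storage,
// so the compiler can treat them the same way when passing them by value.
static_assert(sizeof(abcd) == 4 * sizeof(int), "struct has no padding");
static_assert(sizeof(std::array<int, 4>) == sizeof(abcd), "same size as the struct");
```

On a typical x86-64 Linux target, fits_in_two_gprs<abcd>() and fits_in_two_gprs<std::array<int, 4>>() should both be true, which is consistent with rdi and rsi carrying the whole argument in both cases.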

The last function, however, contains calls to std::get, which require the address of the passed tuple, so that address is passed in rdi. Since you're compiling with C++14, std::get is marked constexpr. The compiler was able to evaluate it, so the memory accesses were emitted directly in the test function instead of calls to std::get. Notice that this is different from inlining.
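A property of the type itself is also worth checking. The Itanium C++ ABI, which GCC follows on this platform, passes a class type by invisible reference when it has a non-trivial copy constructor, move constructor or destructor. Whether std::tuple falls into that category depends on the standard library implementation, so the sketch below only reports it for the tuple instead of asserting it. The name trivial_for_calls is hypothetical, and the check is an approximation of the ABI rule, not its exact wording.

```cpp
#include <array>
#include <tuple>
#include <type_traits>

struct abcd {
    int a; int b; int c; int d;
};

// Hypothetical approximation of "trivial for the purposes of calls":
// copy constructor, move constructor and destructor are all trivial.
template <typename T>
constexpr bool trivial_for_calls() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_move_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// The plain struct and std::array are aggregates of ints, so they qualify.
static_assert(trivial_for_calls<abcd>(), "plain struct is trivial");
static_assert(trivial_for_calls<std::array<int, 4>>(), "std::array<int, 4> is trivial");

// Implementation-dependent, so it is deliberately not asserted. If this is
// false for your standard library, the tuple must be passed by address.
constexpr bool tuple_is_trivial =
    trivial_for_calls<std::tuple<int, int, int, int>>();
```

If tuple_is_trivial comes out false with your toolchain, that would match the listing, where main builds the tuple on the stack and passes rsp in rdi.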

Hadi Brais answered Nov 20 '22 00:11
