
C++ calling convention for passing big objects on Linux/x86-64

I'm trying to understand the overhead of passing an object by value as a function parameter in C++ on Linux/x86-64.

The experimental code I used for the exploration is posted below and on godbolt.org: https://godbolt.org/z/r9Yfv4

Assume the function takes a single parameter. What I observed is:

  1. If the parameter object is 8 bytes in size, it is passed in RDI.
  2. If the parameter is 16 bytes in size (containing two sub-objects of 8 bytes each), the two sub-objects are passed in RDI and RSI.
  3. If the parameter is bigger than 16 bytes, it is passed on the stack.

I only consider integral types, pointers, and types composed of these basic types. I know passing floats/doubles is different.

The size of std::function is 32 bytes (GCC/Linux implementation: long + long + pointer + pointer = 32 bytes). So passing std::function by value should look like passing struct Person3 defined in my code. But the output assembly shows that passing std::function is very different from passing struct Person3. It looks like std::function is passed via a pointer, am I right? Why is there such a difference?

#include <functional>

struct Person0 {
  long name;
};

long GetName(Person0 p) {
  return p.name;
}

struct Person1 {
  long name;
  long age;
};

long GetName(Person1 p) {
  return p.name;
}

struct Person2 {
  long name;
  long age;
  long height;
};

long GetName(Person2 p) {
  return p.name;
}

struct Person3 {
  long name;
  long age;
  long height;
  long weight;
};

long GetName(Person3 p) {
  return p.name + sizeof(p);
}

long Invoke(std::function<long(long)> f) {
  return f(20) + sizeof(f);
}


int main() {
  Person3 p;
  p.name = 13;
  p.age = 23;
  p.height = 33;
  p.weight = 43;
  long n = GetName(p);

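  // Give ff a target: invoking an empty std::function throws std::bad_function_call.
  std::function<long(long)> ff = [](long x) { return x; };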
  Invoke(ff);
  return 0;
}
Xiaoyong Guo asked Jan 22 '21 02:01



1 Answer

The document that you want to read is the System V ABI for x86-64, in particular section 3.2.3, «Parameter Passing».

Structs larger than 32 bytes always go on the stack. For structs of 32 bytes or less, some additional logic applies:

Parameter Passing Rules

The post-merger cleanup step says that if the size of the aggregate exceeds two eightbytes (16 bytes), and the first eightbyte is not SSE or any other eightbyte is not SSEUP, the whole aggregate is classified as MEMORY (that is, passed on the stack). This is why your integer-only structs larger than 16 bytes end up on the stack.

Regarding the use of std::function, there is one last rule that might explain it:

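  1. If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER)

std::function has both a user-provided copy constructor and a user-provided destructor, so the caller makes a temporary copy and passes only a pointer to it. That matches what you see in the assembly.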
Marco answered Oct 16 '22 20:10