
GCC/Clang x86_64 C++ ABI mismatch when returning a tuple?

When trying to optimize return values on x86_64, I noticed a strange thing. Namely, given the code:

#include <cstdint>
#include <tuple>
#include <utility>

using namespace std;

constexpr uint64_t a = 1u;
constexpr uint64_t b = 2u;

pair<uint64_t, uint64_t> f() { return {a, b}; }
tuple<uint64_t, uint64_t> g() { return tuple<uint64_t, uint64_t>{a, b}; }

Clang 3.8 outputs this assembly code for f:

movl $1, %eax
movl $2, %edx
retq

and this for g:

movl $2, %eax
movl $1, %edx
retq

which look optimal. However, when compiled with GCC 6.1, while the generated assembly for f is identical to what Clang output, the assembly generated for g is:

movq %rdi, %rax
movq $2, (%rdi)
movq $1, 8(%rdi)
ret

It looks like the type of the return value is classified as MEMORY by GCC but as INTEGER by Clang. I can confirm that linking Clang-compiled code with GCC-compiled code can result in segmentation faults (Clang calling GCC-compiled g(), which writes to wherever %rdi happens to point) and in an invalid value being returned (GCC calling Clang-compiled g()). Which compiler is at fault?

Related:

  • G++ and clang++ incompatibility with standard library when building shared libraries?
  • [cxx-abi-dev] Non-trivial move constructor

See also

  • System V Application Binary Interface. AMD64 Architecture Processor Supplement. Draft Version 0.99.5
jotik asked May 26 '16 09:05



2 Answers

The ABI states that parameter values are classified according to a specific algorithm. Relevant here is:

  3. If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.

  4. Each field of an object is classified recursively so that always two fields are considered. The resulting class is calculated according to the classes of the fields in the eightbyte:

In this case, each of the fields (for either a tuple or a pair) is of type uint64_t and so occupies an entire "eightbyte". The "two fields" to be considered in each eightbyte, then, are the "NO_CLASS" (as per 3) and the uint64_t field, which is classified as INTEGER.

There is also, related to parameter passing:

If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER)

An object with a non-trivial copy constructor or a non-trivial destructor must have an address, and therefore needs to be in memory, which is why the above requirement exists. The same is true for return values, though this seems to have been omitted from the specification (probably by accident).

Finally, there is:

(c) If the size of the aggregate exceeds two eightbytes and the first eight-byte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.

That doesn't apply here, obviously; the size of the aggregate is exactly two eightbytes.

On returning of values, the text says:

  1. Classify the return type with the classification algorithm

Which means, as per above, that the tuple should be classified as INTEGER. Then:

  3. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

This is quite clear.

The only still-open question is whether the types are non-trivially-copy-constructible/destructible. As mentioned above, values of such type cannot be passed or returned in registers, even though the specification does not seem to recognize the problem for return values. However, we can easily show that the tuple and pair are both trivially-copy-constructible and trivially-destructible, using the following program:

Test program:

#include <utility>
#include <cstdint>
#include <tuple>
#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    cout << "pair is trivial? : " << is_trivial<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_copy_constructible? : " << is_trivially_copy_constructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is standard_layout? : " << is_standard_layout<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is pod? : " << is_pod<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_destructable? : " << is_trivially_destructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_move_constructible? : " << is_trivially_move_constructible<pair<uint64_t, uint64_t> >::value << endl;

    cout << "tuple is trivial? : " << is_trivial<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_copy_constructible? : " << is_trivially_copy_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is standard_layout? : " << is_standard_layout<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is pod? : " << is_pod<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_destructable? : " << is_trivially_destructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_move_constructible? : " << is_trivially_move_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    return 0;
}

Output when compiled with GCC or Clang:

pair is trivial? : 0
pair is trivially_copy_constructible? : 1
pair is standard_layout? : 1
pair is pod? : 0
pair is trivially_destructable? : 1
pair is trivially_move_constructible? : 1
tuple is trivial? : 0
tuple is trivially_copy_constructible? : 1
tuple is standard_layout? : 0
tuple is pod? : 0
tuple is trivially_destructable? : 1
tuple is trivially_move_constructible? : 0

This implies that GCC is getting it wrong. The return value should be passed in %rax, %rdx.

(The main noticeable difference between the types is that pair is standard layout and trivially move-constructible whereas tuple is neither, so it's possible that GCC always returns non-trivially-move-constructible values via a pointer, for example.)

davmac answered Sep 21 '22 15:09


As davmac's answer shows, the libstdc++ std::tuple is trivially copy constructible, but not trivially move constructible. The two compilers disagree on whether the move constructor should affect the argument passing conventions.

The C++ ABI thread you linked to seems to explain that disagreement: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-February/002891.html

In summary, Clang implements exactly what the ABI spec says, whereas G++ implements what the spec was supposed to say but was never updated to actually say.

Jonathan Wakely answered Sep 17 '22 15:09