GCC/Clang x86_64 C++ ABI mismatch when returning a tuple?

Tags: c++, tuples, x86-64, abi, compiler-bug

When trying to optimize return values on x86_64, I noticed a strange thing. Namely, given the code:

#include <cstdint>
#include <tuple>
#include <utility>

using namespace std;

constexpr uint64_t a = 1u;
constexpr uint64_t b = 2u;

pair<uint64_t, uint64_t> f() { return {a, b}; }
tuple<uint64_t, uint64_t> g() { return tuple<uint64_t, uint64_t>{a, b}; }

Clang 3.8 outputs this assembly code for f:

movl $1, %eax
movl $2, %edx
retq

and this for g:

movl $2, %eax
movl $1, %edx
retq

which look optimal. However, when compiled with GCC 6.1, while the generated assembly for f is identical to what Clang output, the assembly generated for g is:

movq %rdi, %rax
movq $2, (%rdi)
movq $1, 8(%rdi)
ret

It looks like the type of the return value is classified as MEMORY by GCC but as INTEGER by Clang. I can confirm that linking Clang-compiled code with GCC-compiled code can result in segmentation faults (Clang calling the GCC-compiled g(), which writes to wherever %rdi happens to point) and in an invalid value being returned (GCC calling the Clang-compiled g()). Which compiler is at fault?

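Related:

G++ and clang++ incompatibility with standard library when building shared libraries?
[cxx-abi-dev] Non-trivial move constructor

See also:

System V Application Binary Interface. AMD64 Architecture Processor Supplement. Draft Version 0.99.5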

2 Answers

The ABI states that parameter values are classified according to a specific algorithm. Relevant here is:

3. If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.

4. Each field of an object is classified recursively so that always two fields are considered. The resulting class is calculated according to the classes of the fields in the eightbyte:

In this case, each of the fields (for either a tuple or a pair) is of type uint64_t and so occupies an entire "eightbyte". The "two fields" to be considered in each eightbyte, then, are the "NO_CLASS" (as per 3) and the uint64_t field, which is classified as INTEGER.
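The merge step described above can be sketched as a small compile-time model. This is only an illustration: the names `Class`, `merge`, `Field`, `Eightbytes` and `add_field` are hypothetical, the X87-related classes are left out, and the field offsets 0 and 8 are an assumption about how two uint64_t members are laid out.

```cpp
#include <cstddef>

// Simplified subset of the psABI argument classes; X87, X87UP and
// COMPLEX_X87 are deliberately left out of this sketch.
enum class Class { NoClass, Integer, Sse, Memory };

// Merge two classes that fall into the same eightbyte: equal classes are
// kept, NO_CLASS yields to the other class, MEMORY beats everything else,
// and INTEGER beats SSE.
constexpr Class merge(Class a, Class b) {
    return a == b ? a
         : a == Class::NoClass ? b
         : b == Class::NoClass ? a
         : (a == Class::Memory || b == Class::Memory) ? Class::Memory
         : (a == Class::Integer || b == Class::Integer) ? Class::Integer
         : Class::Sse;
}

// Hypothetical description of one scalar field: byte offset and class.
struct Field {
    std::size_t offset;
    Class cls;
};

// Classes of the two eightbytes of a 16-byte aggregate.
struct Eightbytes {
    Class lo;
    Class hi;
};

// Fold one field into the eightbyte that contains it (offsets 0..15 assumed).
constexpr Eightbytes add_field(Eightbytes e, Field f) {
    return f.offset < 8 ? Eightbytes{merge(e.lo, f.cls), e.hi}
                        : Eightbytes{e.lo, merge(e.hi, f.cls)};
}

// Two uint64_t fields, assumed to sit at offsets 0 and 8. Both eightbytes
// start as NO_CLASS and each is merged with one INTEGER field.
constexpr Eightbytes two_u64 = add_field(
    add_field(Eightbytes{Class::NoClass, Class::NoClass},
              Field{0, Class::Integer}),
    Field{8, Class::Integer});

static_assert(two_u64.lo == Class::Integer && two_u64.hi == Class::Integer,
              "both eightbytes end up INTEGER");
```

Under these assumptions the model gives the same result for the pair and for the tuple, since the classification only looks at field offsets and classes, not at which template the fields belong to.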

There is also, related to parameter passing:

If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER)

An object with a non-trivial copy constructor or a non-trivial destructor must have an address, and therefore needs to be in memory, which is why the above requirement exists. The same is true for return values, though this seems to have been omitted from the specification (probably by accident).

Finally, there is:

(c) If the size of the aggregate exceeds two eightbytes and the first eight-byte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.

That doesn't apply here, obviously; the size of the aggregate is exactly two eightbytes.

On returning of values, the text says:

1. Classify the return type with the classification algorithm

Which means, as per above, that both eightbytes of the tuple should be classified as INTEGER. Then:

3. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

This is quite clear.

The only remaining question is whether the types have a non-trivial copy constructor or a non-trivial destructor. As mentioned above, values of such types cannot be passed or returned in registers, even though the specification does not seem to recognize the problem for return values. However, we can easily show that the tuple and pair are both trivially copy-constructible and trivially destructible, using the following test program:

#include <utility>
#include <cstdint>
#include <tuple>
#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    cout << "pair is trivial? : " << is_trivial<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_copy_constructible? : " << is_trivially_copy_constructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is standard_layout? : " << is_standard_layout<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is pod? : " << is_pod<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_destructable? : " << is_trivially_destructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_move_constructible? : " << is_trivially_move_constructible<pair<uint64_t, uint64_t> >::value << endl;

    cout << "tuple is trivial? : " << is_trivial<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_copy_constructible? : " << is_trivially_copy_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is standard_layout? : " << is_standard_layout<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is pod? : " << is_pod<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_destructable? : " << is_trivially_destructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_move_constructible? : " << is_trivially_move_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    return 0;
}

Output when compiled with GCC or Clang:

pair is trivial? : 0
pair is trivially_copy_constructible? : 1
pair is standard_layout? : 1
pair is pod? : 0
pair is trivially_destructable? : 1
pair is trivially_move_constructible? : 1
tuple is trivial? : 0
tuple is trivially_copy_constructible? : 1
tuple is standard_layout? : 0
tuple is pod? : 0
tuple is trivially_destructable? : 1
tuple is trivially_move_constructible? : 0

This implies that GCC is getting it wrong. The return value should be returned in %rax and %rdx.

(The main noticeable differences between the types are that pair is standard-layout and trivially move-constructible whereas tuple is neither, so it's possible that GCC is always returning non-trivially-move-constructible values via a pointer, for example.)
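One way to test that guess in isolation is a type that differs from a plain two-field struct only in its move constructor. The following is a minimal sketch; `Probe` and `make_probe` are hypothetical names introduced here, not part of any library.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical probe type: trivial copy constructor and trivial destructor,
// but a user-provided (and therefore non-trivial) move constructor.
struct Probe {
    std::uint64_t first;
    std::uint64_t second;

    Probe(std::uint64_t a, std::uint64_t b) : first(a), second(b) {}
    Probe(const Probe&) = default;  // trivial
    Probe(Probe&& other) noexcept   // user-provided, so non-trivial
        : first(other.first), second(other.second) {}
};

// The probe reproduces the trait combination reported for tuple above.
static_assert(std::is_trivially_copy_constructible<Probe>::value,
              "copy constructor is trivial");
static_assert(std::is_trivially_destructible<Probe>::value,
              "destructor is trivial");
static_assert(!std::is_trivially_move_constructible<Probe>::value,
              "move constructor is not trivial");

// Function whose generated code is to be inspected.
Probe make_probe() { return Probe{1u, 2u}; }
```

Compiling this with `-O2 -S` under each compiler and comparing the code for `make_probe` would show whether the move constructor alone changes the return convention. If the guess is right, GCC would be expected to write through the pointer in %rdi; that outcome is not verified here.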

answered Sep 21 '22 15:09

davmac

As davmac's answer shows, the libstdc++ std::tuple is trivially copy constructible, but not trivially move constructible. The two compilers disagree on whether the move constructor should affect the argument passing conventions.

The C++ ABI thread you linked to seems to explain that disagreement: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-February/002891.html

In summary, Clang implements exactly what the ABI spec says, whereas G++ implements what the spec was supposed to say but was never updated to actually say.
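The two readings can be approximated with standard type traits. This is a rough sketch rather than either compiler's actual logic: the function names are hypothetical, and the traits only approximate the ABI wording (they also react to deleted or inaccessible constructors, for instance).

```cpp
#include <cstdint>
#include <type_traits>

// Approximation of the rule as literally quoted from the spec: only the copy
// constructor and the destructor decide whether registers may be used.
template <typename T>
constexpr bool spec_text_allows_registers() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// Approximation of the stricter reading: the move constructor must be
// trivial as well.
template <typename T>
constexpr bool move_aware_rule_allows_registers() {
    return spec_text_allows_registers<T>() &&
           std::is_trivially_move_constructible<T>::value;
}

// Plain aggregate: every special member function is trivial.
struct Plain {
    std::uint64_t a;
    std::uint64_t b;
};

// Hypothetical type with a trivial copy but a user-provided move constructor.
struct UserMove {
    std::uint64_t a;
    std::uint64_t b;
    UserMove(const UserMove&) = default;
    UserMove(UserMove&& other) noexcept : a(other.a), b(other.b) {}
};

static_assert(spec_text_allows_registers<Plain>() &&
                  move_aware_rule_allows_registers<Plain>(),
              "both readings agree on a plain aggregate");
static_assert(spec_text_allows_registers<UserMove>() &&
                  !move_aware_rule_allows_registers<UserMove>(),
              "the readings diverge once the move constructor is non-trivial");
```

Given the trait output in davmac's answer, instantiating both predicates with tuple<uint64_t, uint64_t> from that libstdc++ would put it in the same category as `UserMove`, which matches the disagreement described here.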

answered Sep 17 '22 15:09

Jonathan Wakely

Question asked by jotik.