Given the following struct...
#include <type_traits>
struct C {
    long a[16]{};
    long b[16]{};
    C() = default;
};
// For godbolt
C construct() {
    static_assert(not std::is_trivial_v<C>);
    static_assert(std::is_standard_layout_v<C>);
    C c;
    return c;
}
...gcc (version 10.2 on x86-64 Linux) with optimization enabled (at all three levels) produces the following assembly[1] for construct:
construct():
    mov r8, rdi
    xor eax, eax
    mov ecx, 32
    rep stosq
    mov rax, r8
    ret
Once I provide an empty default constructor...
#include <type_traits>
struct C {
    long a[16]{};
    long b[16]{};
    C() {} // <-- The only change
};
// For godbolt
C construct() {
    static_assert(not std::is_trivial_v<C>);
    static_assert(std::is_standard_layout_v<C>);
    C c;
    return c;
}
...the generated assembly changes to zeroing each array separately, instead of the single memset-style fill over the whole object in the original:
construct():
    mov rdx, rdi
    mov eax, 0
    mov ecx, 16
    rep stosq
    lea rdi, [rdx+128]
    mov ecx, 16
    rep stosq
    mov rax, rdx
    ret
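In source terms, the two listings amount roughly to the following. This is only a sketch to make the difference concrete: Raw, zero_whole and zero_per_member are hypothetical names that do not appear in the code above, and it assumes the two arrays sit back to back without padding (which the static_assert checks).

```cpp
#include <cstring>

// Hypothetical stand-in for C, without the member initializers.
struct Raw {
    long a[16];
    long b[16];
};

// Assumption: no padding between or after the arrays.
static_assert(sizeof(Raw) == sizeof(Raw::a) + sizeof(Raw::b),
              "arrays are expected to be contiguous");

// Roughly the first listing: one fill covering the whole object.
void zero_whole(Raw& r) {
    std::memset(&r, 0, sizeof r);
}

// Roughly the second listing: one fill per member array.
void zero_per_member(Raw& r) {
    std::memset(r.a, 0, sizeof r.a);
    std::memset(r.b, 0, sizeof r.b);
}
```

Both functions leave the object in the same state; only the number of fill operations differs.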
Both structs appear equivalent in that neither is trivial and both are standard layout. Is gcc just missing an optimization opportunity, or is there more to it from the C++-the-language perspective?
The example is a stripped-down version of production code where this made a material difference in performance.
[1] Godbolt: https://godbolt.org/z/8n1Mae
If the implicitly-declared default constructor is not defined as deleted, it is defined (that is, a function body is generated and compiled) by the compiler if odr-used, and it has exactly the same effect as a user-defined constructor with empty body and empty initializer list.
A default constructor is declared implicitly as follows: if no user-declared constructors of any kind are provided for a class type (struct, class, or union), the compiler always declares a default constructor as an inline public member of that class.
If other user-declared constructors are present, we can still force the compiler to generate the default constructor that would otherwise have been implicitly declared, by using the keyword default.
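As a small illustration of forcing generation with the keyword default, here is a minimal sketch. NoDefault and WithDefault are hypothetical types made up for this example; the only thing it relies on is std::is_default_constructible from the standard library.

```cpp
#include <type_traits>

// A user-declared constructor suppresses the implicit default constructor.
struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

// Same user-declared constructor, but the default constructor is
// requested back explicitly with "= default".
struct WithDefault {
    explicit WithDefault(int v) : value(v) {}
    WithDefault() = default;
    int value = 0;
};

static_assert(!std::is_default_constructible<NoDefault>::value,
              "no default constructor is declared for NoDefault");
static_assert(std::is_default_constructible<WithDefault>::value,
              "the defaulted constructor makes WithDefault default constructible");
```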
While I agree that this seems like a missed optimization opportunity, I noticed one difference from the language-level perspective. The implicitly-defined constructor is constexpr while the empty default constructor in your example is not. From cppreference.com:
That is, [the implicitly-defined constructor] calls the default constructors of the bases and of the non-static members of this class. If this satisfies the requirements of a constexpr constructor, the generated constructor is constexpr (since C++11).
So, as the initialization of the arrays of long is constexpr, the implicitly-defined constructor is as well. However, the user-defined one is not, as it is not marked constexpr. We can also confirm this by trying to make the construct function of the example constexpr. For the implicitly-defined constructor this works without any problems, but for the empty user-defined version it fails to compile because
<source>:3:8: note: 'C' is not an aggregate, does not have a trivial default constructor, and has no 'constexpr' constructor that is not a copy or move constructor
as we can see here: https://godbolt.org/z/MnsbzKv1v
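A minimal sketch of that check, assuming C++14 or later: the struct is the one from the question, and the static_assert forces construct to be evaluated at compile time.

```cpp
struct C {
    long a[16]{};
    long b[16]{};
    C() = default;   // implicitly constexpr; replacing this with "C() {}"
                     // makes the code below fail to compile
};

// Usable in a constant expression only if C has a constexpr
// default constructor.
constexpr C construct() {
    C c;
    return c;
}

static_assert(construct().a[0] == 0, "construct() is a constant expression");
```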
So to fix this difference we can make the empty user-defined constructor constexpr:
struct C {
    long a[16]{};
    long b[16]{};
    constexpr C() {}
};
Somewhat surprisingly, gcc now generates the optimized version, i.e. the exact same code as for the defaulted default constructor: https://godbolt.org/z/cchTnEhKW
I do not know why, but this difference in constexpr-ness actually seems to help the compiler in this case. So while it seems like gcc should be able to generate the same code without specifying constexpr, I guess it is good to know that it can be beneficial.
As an additional test for this observation, we could try to make the implicitly-defined constructor non-constexpr and see if gcc fails to do the optimization. One simple way that I can think of to test this is to have C inherit from an empty class with a non-constexpr default constructor:
struct D {
    D() {}
};
struct C : D {
    long a[16]{};
    long b[16]{};
    C() = default;
};
And indeed, this generates the assembly that initializes the fields individually again. Once we make D() constexpr, we get the optimized code back. See https://godbolt.org/z/esYhc1cfW.
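For reference, a sketch of the variant with a constexpr base constructor, again assuming C++14 or later. The static_assert only confirms the language-level property (the defaulted C() is constexpr); it says nothing about which assembly gcc emits.

```cpp
struct D {
    constexpr D() {}   // dropping constexpr here makes the
                       // static_assert below fail to compile
};

struct C : D {
    long a[16]{};
    long b[16]{};
    C() = default;     // constexpr only because D() is constexpr
};

constexpr C construct() {
    C c;
    return c;
}

static_assert(construct().b[15] == 0, "defaulted C() is constexpr");
```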