
Is GCC miscompiling this code, or is it UB?

Consider this code:

#include <cstring>

template<typename T>
struct DefaultMsgImpl
{
    DefaultMsgImpl() { memset(this, 0, sizeof(T)); }
};

struct Msg : DefaultMsgImpl<Msg>
{
    int num;
};

int f()
{
    Msg msg{.num = 66};
    return msg.num;
}

With GCC 13 (and older), f() returns 0, but with all other compilers (MSVC, Clang, ICC, ICX, ...), f() returns 66.

Is the above code valid C++ (in which case it seems like GCC is miscompiling it), or is it undefined behavior (e.g. because of the memset touching the memory where Msg will live before T's lifetime begins)? If it is UB, I'd appreciate an answer citing the C++ standard.

Demo: https://godbolt.org/z/1qf9dv8cG

asked Sep 13 '25 02:09 by John Zwinck


1 Answer

What happens in this case is UB, regardless of how you initialize Msg. Calling pure virtual functions overridden by the derived class, or accessing the derived class's non-static members, from a base-class constructor is UB.
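The ordering behind this rule can be sketched with a small trace. This is only an illustration: Base, Member, Derived, trace and construction_order are hypothetical names, not part of the question's code. It shows that the base-class constructor body runs before the derived class's members are constructed, so those members do not exist yet when the base constructor executes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: accumulates one letter per constructor that runs.
inline std::string& trace() {
    static std::string s;
    return s;
}

struct Base   { Base()   { trace() += 'B'; } };
struct Member { Member() { trace() += 'M'; } };

struct Derived : Base {
    Member m;                       // constructed after Base, before Derived's body
    Derived() { trace() += 'D'; }
};

// Builds one Derived object and reports the order in which constructors ran.
std::string construction_order() {
    trace().clear();
    Derived d;
    (void)d;
    return trace();
}
```

Calling construction_order() yields "BMD": base first, then members, then the derived constructor body.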

Formally, type T (a.k.a. Msg) is not constructed yet. So while doing something like this from a member function after Msg's initialization is (relatively) fine and is the core of the CRTP pattern, doing it from the constructor is not.
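For contrast, here is a minimal sketch of the usual CRTP usage, where the base reaches into the derived class only from an ordinary member function called on a fully constructed object. Doubler, Msg2 and use_after_construction are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

template <typename T>
struct Doubler {
    // Called only after the complete object exists, so the downcast
    // to T and the access to T::num are valid.
    int doubled() const { return static_cast<const T*>(this)->num * 2; }
};

struct Msg2 : Doubler<Msg2> {
    int num = 0;
};

int use_after_construction() {
    Msg2 msg;            // construction has finished here
    msg.num = 66;
    return msg.doubled();
}
```

Here use_after_construction() returns 132. The same downcast inside Doubler's constructor would refer to a Msg2 whose lifetime has not begun.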

Note that designated initializers such as {.num = 66} were a C99 feature. Some C++ compilers supported them early as an extension (or in C++ mode, in the case of bilingual compilers), which makes this code formally ill-formed before ISO C++20. In -pedantic mode GCC would warn about this. How exactly the extension should work when mixed with constructors was never defined. De facto, GCC first performs the initialization and then runs the constructor's body.

Designated initialization is a form of aggregate initialization. In ISO C++20 it is allowed only if the class has no user-declared or inherited constructors, which isn't the case here. GCC's GNU C++17 mode, for example, does not appear to enforce that requirement.

Per the ISO standard, the elements of a non-union aggregate can be either default-initialized or initialized explicitly. Otherwise the code is ill-formed, no diagnostic required (9.4.5 Aggregate Initializers, N4868). By using memset you circumvent the compiler's possible, but not required, complaints.
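As a rough illustration of those two ways of initializing aggregate elements, consider the sketch below. It assumes a plain aggregate with no base class; PlainMsg, flags and both functions are hypothetical and do not appear in the question. No memset is involved: elements that are not named in the initializer list get their default member initializer or are zeroed.

```cpp
#include <cassert>

// Hypothetical plain aggregate (needs C++14 or later because of the
// default member initializer).
struct PlainMsg {
    int num;        // no default member initializer
    int flags = 7;  // default member initializer
};

int sum_of_empty_init() {
    PlainMsg m{};   // num is initialized from {} (zero), flags takes 7
    return m.num + m.flags;
}

int explicit_num() {
    PlainMsg m{66}; // num is initialized explicitly, flags still takes 7
    return m.num;
}
```

With this sketch, sum_of_empty_init() returns 7 and explicit_num() returns 66.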

If there is a chicken-and-egg problem, the pattern can be extended with a base class that holds all the non-static members and definitions required by the CRTP base:

template <typename T>
struct MsgTrait {};

template<typename T>
struct DefaultMsgImpl : MsgTrait<T>
{
    DefaultMsgImpl(int num) { this->num = num; }
};

// Msg
struct Msg;
template <>
struct MsgTrait<Msg> {
    int num;
    //something else?
};    

struct Msg : DefaultMsgImpl<Msg>
{
    Msg(int MsgNumber) : DefaultMsgImpl(MsgNumber) {}
};

int f()
{
    // Msg msg{.num = 66}; cannot do that. And Msg cannot be trivially constructed
    Msg msg{66};
    return msg.num;
}

MsgTrait<Msg> is constructed and initialized first, and it is considered a complete type, together with any nested types it has, as long as its specialization is defined before the attempt to instantiate DefaultMsgImpl<Msg>.

answered Sep 15 '25 17:09 by Swift - Friday Pie