
Active member of a union, uniform initialization and constructors

As the (Working Draft of) C++ Standard says:

9.5.1 [class.union]

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] The size of a union is sufficient to contain the largest of its non-static data members. Each non-static data member is allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.

But I don't know how to identify which member of a union is the active one, and I'm not familiar enough with the standard to locate what it says about this. I tried to find out how the active member is set, but I only found how it is switched:

9.5.4 [class.union]

[ Note: In general, one must use explicit destructor calls and placement new operators to change the active member of a union. —end note ] [Example: Consider an object u of a union type U having non-static data members m of type M and n of type N. If M has a non-trivial destructor and N has a non-trivial constructor (for instance, if they declare or inherit virtual functions), the active member of u can be safely switched from m to n using the destructor and placement new operator as follows:

u.m.~M();
new (&u.n) N;

end example ]
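For concreteness, here is a minimal sketch of that switch with real types. The union U and the choices M = std::string and N = std::vector<int> are assumptions made for illustration only; the standard's example leaves M and N abstract.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-ins for the M and N of the quoted example.
using M = std::string;
using N = std::vector<int>;

union U {
    M m;
    N n;

    U() : m("hello") {}  // the mem-initializer constructs m, so m starts out active
    ~U() {}              // the union cannot tell which member is active; the user cleans up
};

// Switches the active member of u from m to n and returns the size of n.
// Precondition: m is the active member. On return, u has no live member.
std::size_t switch_to_n(U& u) {
    u.m.~M();                  // end the lifetime of m
    new (&u.n) N{1, 2, 3};     // begin the lifetime of n with placement new
    std::size_t size = u.n.size();
    u.n.~N();                  // destroy n explicitly, since ~U() does nothing
    return size;
}
```

Both members have non-trivial constructors and destructors here, so U needs user-provided special member functions and the switch has to be spelled out by hand.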

So my guess is that the active member of a union is the one first assigned, used, constructed or placement-new'ed, but this becomes tricky with uniform initialization. Consider the following code:

union Foo
{
    struct {char a,b,c,d;};
    char array[4];
    int integer;
};

Foo f; // default ctor
std::cout << f.a << f.b << f.c << f.d << '\n';

Which is the active member of the union in the code above? Is std::cout reading from the active member of the union? What about the code below?

Foo f{0,1,2,3}; // uniform initialization
std::cout << f.a << f.b << f.c << f.d << '\n';

With the lines above we could be initializing either the nested anonymous struct or the array. If I provide only a single integer, I could be initializing Foo::a, Foo::array or Foo::integer. Which one would be the active member?

Foo f{0}; // uniform initialization
std::cout << f.integer << '\n';

I guess that the active member would be the anonymous struct in all of the above cases, but I'm not sure.

If I want to activate one or the other union member, should I provide a constructor activating it?

union Bar
{
    // #1 Activate anonymous struct
    Bar(char x, char y, char z, char t) : a(x),b(y),c(z),d(t) {}
    // #2 Activate array
    Bar(char (&a)[4]) { std::copy(std::begin(a), std::end(a), std::begin(array)); }
    // #3 Activate integer
    Bar(int i) : integer(i) {}

    struct {char a,b,c,d;};
    char array[4];
    int integer;
};

I'm almost sure that #1 and #3 will make the anonymous struct and the integer, respectively, the active member, but I don't know about #2: by the time we reach the body of the constructor, the members are already constructed! So are we calling std::copy on an inactive union member?

Questions:

  • Which are the active union members of Foo if it is constructed with the following uniform initialization:
    • Foo{};
    • Foo{1,2,3,4};
    • Foo{1};
  • In constructor #2 of Bar, is Bar::array the active union member?
  • Where in the standard can I read about which is exactly the active union member and how to set it without placement new?
PaperBirdMaster asked Jul 13 '15 16:07



1 Answer

Your concern about the lack of a rigorous definition of the active member of a union is shared by (at least some of) the members of the standardization committee - see the latest note (dated May 2015) in the description of active issue 1116:

We never say what the active member of a union is, how it can be changed, and so on. [...]

I think we can expect some sort of clarification in future versions of the working draft. That note also indicates that the best we have so far is the note in the paragraph you quoted in your question, [9.5p4].

That being said, let's look at your other questions.

First of all, there are no anonymous structs in standard C++ (only anonymous unions); struct {char a,b,c,d;}; will give you warnings if compiled with reasonably strict options (-std=c++1z -Wall -Wextra -pedantic for Clang and GCC, for example). Going forward, I'll assume we have a declaration like struct { char a, b, c, d; } s; and everything else is adjusted accordingly.

The implicitly defaulted default constructor in your first example doesn't perform any initialization according to [12.6.2p9.2]:

In a non-delegating constructor, if a given potentially constructed subobject is not designated by a mem-initializer-id (including the case where there is no mem-initializer-list because the constructor has no ctor-initializer), then

(9.1) - if the entity is a non-static data member that has a brace-or-equal-initializer and either

(9.1.1) - the constructor’s class is a union (9.5), and no other variant member of that union is designated by a mem-initializer-id or
(9.1.2) - the constructor’s class is not a union, and, if the entity is a member of an anonymous union, no other member of that union is designated by a mem-initializer-id,

the entity is initialized as specified in 8.5;

(9.2) - otherwise, if the entity is an anonymous union or a variant member (9.5), no initialization is performed;

(9.3) - otherwise, the entity is default-initialized (8.5).

I suppose we could say that f has no active member after its default constructor has finished executing, but I don't know of any standard wording that clearly indicates that. What can be said in practice is that it makes no sense to attempt to read the value of any of f's members, since they're indeterminate.

In your next example, you're using aggregate initialization, which is reasonably well-defined for unions according to [8.5.1p16]:

When a union is initialized with a brace-enclosed initializer, the braces shall only contain an initializer-clause for the first non-static data member of the union. [ Example:

union u { int a; const char* b; }; 
u a = { 1 }; 
u b = a; 
u c = 1;               // error 
u d = { 0, "asdf" };   // error 
u e = { "asdf" };      // error 

end example ]

That, together with brace elision for the initialization of the nested struct, as specified in [8.5.1p12], makes the struct the active member. It answers your next question as well: you can only initialize the first union member using that syntax.
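As a small sketch of that rule, using the named-member version of Foo assumed earlier in this answer (the helper function name is made up for illustration):

```cpp
union Foo {
    struct { char a, b, c, d; } s;  // first member: the only one a braced list can target
    char array[4];
    int integer;
};

// Initializes the first member through brace elision and reads it back.
int sum_of_struct_chars() {
    Foo f{1, 2, 3, 4};  // treated as Foo f{{1, 2, 3, 4}}: initializes f.s
    return f.s.a + f.s.b + f.s.c + f.s.d;  // reads the member that was initialized
}
```

There is no way to direct the same four values at array with this syntax. Selecting another member needs a constructor or a brace-or-equal-initializer. Some compilers may emit a missing-braces warning for the elided form.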

Your next question:

If I want to activate one or the other union member, should I provide a constructor activating it?

Yes, or a brace-or-equal-initializer for exactly one member according to [12.6.2p9.1.1] quoted above; something like this:

union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer = 7;
};

Foo f;

After the above, the active member will be integer. All of the above should also answer your question about #2 (the members are not already constructed when we reach the body of the constructor - #2 is fine as well).
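The constructor route can be sketched as follows. This is one conservative variant rather than the only valid form: every constructor names the member it activates in its mem-initializer list, including the array case. The nested type name Chars and the const qualifier on the array parameter are assumptions added for the example.

```cpp
#include <algorithm>
#include <iterator>

union Bar {
    struct Chars { char a, b, c, d; };  // hypothetical name for the struct type

    // #1 Activate the struct: the mem-initializer constructs s.
    Bar(char x, char y, char z, char t) : s{x, y, z, t} {}

    // #2 Activate the array: array{} zero-initializes it, then the body fills it in.
    explicit Bar(const char (&src)[4]) : array{} {
        std::copy(std::begin(src), std::end(src), std::begin(array));
    }

    // #3 Activate the integer.
    explicit Bar(int i) : integer(i) {}

    Chars s;
    char array[4];
    int integer;
};
```

Naming array in the mem-initializer list is not required by the reasoning above, but it leaves no doubt about which member has been initialized before std::copy writes to it.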

Wrapping up, both Foo{} and Foo{1} perform aggregate initialization; they're interpreted as Foo{{}} and Foo{{1}}, respectively (because of brace elision), and initialize the struct. The first one sets all the struct members to 0, and the second one sets the first member to 1 and the rest to 0, according to [8.5.1p7].
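A short check of both cases, again assuming the named-member version of Foo (the two helper function names are invented for the example):

```cpp
union Foo {
    struct { char a, b, c, d; } s;
    char array[4];
    int integer;
};

// Foo{} initializes the struct; all of its chars should be zero.
bool empty_braces_zero_the_struct() {
    Foo f{};
    return f.s.a == 0 && f.s.b == 0 && f.s.c == 0 && f.s.d == 0;
}

// Foo{1} sets s.a to 1; the chars without an initializer should be zero.
bool single_value_sets_first_char() {
    Foo f{1};
    return f.s.a == 1 && f.s.b == 0 && f.s.c == 0 && f.s.d == 0;
}
```

Both functions only read f.s, the member that the braced list initialized, so they stay clear of the question of reading an inactive member.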


All standard quotes are from the current working draft, N4527.


Paper N4430, which deals with somewhat related issues, but hasn't been integrated into the working draft yet, provides a definition for active member:

In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended ([basic.life]).

This effectively passes the buck to the definition of lifetime in [3.8], which also has a few issues open against it, including the aforementioned issue 1116, so I think we'll have to wait for several such issues to be resolved in order to have a complete and consistent definition. The definition of lifetime as it currently stands doesn't seem to be quite ready.

bogdan answered Oct 06 '22 01:10