Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unrestricted union in practice

Tags:

c++

c++11

unions

I have some questions about unrestricted unions and their application in practice. Let's suppose I have the following code :

struct MyStruct
{
    MyStruct(const std::vector<int>& a) : array(a), type(ARRAY)
    {}
    MyStruct(bool b) : boolean(b), type(BOOL)
    {}
    MyStruct(const MyStruct& ms) : type(ms.type)
    {
        if (type == ARRAY)
            new (&array) std::vector<int>(ms.array);
        else
            boolean = ms.boolean;
    }
    MyStruct& operator=(const MyStruct& ms)
    {
        if (&ms != this) {
            if (type == ARRAY)
                array.~vector<int>(); // EDIT(2) 
            if (ms.type == ARRAY)
                new (&array) std::vector<int>(ms.array);
            else
                boolean = ms.boolean;
            type = ms.type;
        }
        return *this;
    }
    ~MyStruct()
    {
        if (type == ARRAY)
            array.~vector<int>();
    }

    union {
        std::vector<int> array;
        bool             boolean;
    };
    enum {ARRAY, BOOL} type;
};
  1. Is this code valid :) ?
  2. Is it necessary to explicitly call the vector destructor each time we are using the boolean (as stated here http://cpp11standard.blogspot.com/2012/11/c11-standard-explained-1-unrestricted.html)
  3. Why a placement new is required instead of just doing something like 'array = ms.array' ?

EDIT:

  • Yes, it compiles
  • "Members declared inside anonymous unions are actually members of the containing class, and can be initialized in the containing class's constructor." (C++11 anonymous union with non-trivial members)
  • Adding explicit destructors as suggested, leads to SIGSEV with g++ 4.8 / clang 4.2
like image 947
3XX0 Avatar asked May 21 '13 05:05

3XX0


2 Answers

  1. The code's buggy: change array.clear(); to array.~vector<int>();

Explanation: operator= is using placement new over an object that hasn't been destructed, which could do anything but practically you can expect it to leak the dynamic memory the previous array had been using (clear() doesn't release memory / change capacity, it just destructs elements and changes size).

From 9.5/2:

If any non-static data member of a union has a non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move assignment operator (12.8), or destructor (12.4), the corresponding member function of the union must be user-provided or it will be implicitly deleted (8.4.3) for the union.

So, the vector constructor, destructor etc never kicks in by themselves: you must call them explicitly when wanted.

In 9.5/3 there's an example:

Consider the following union:

union U {
    int i;
    float f;
    std::string s;
};

Since std::string (21.3) declares non-trivial versions of all of the special member functions, U will have an implicitly deleted default constructor, copy/move constructor, copy/move assignment operator, and destructor. To use U, some or all of these member functions must be user-provided.

That last bit - "To use U, some or all of these member functions must be user-provided." - seems to presume that U needs to coordinate its own vaguely value-semantic behaviour, but in your case the surrouding struct is doing that so you don't need to define any of these union member functions.

2: we must call the array destructor whenever an array value is being replaced by a boolean value. If in operator= a new array value is being placement-newed instead of assigned, then the old array must also have its destructor called, but using operator= would be more efficient when the existing memory is sufficient for all the elements being copied. Basically, you must match constructions and destructions. UPDATE: the example code has a bug as per your comment below.

3: Why a placement new is required instead of just doing something like 'array = ms.array' ?

array = ms.array invokes std::vector<int>::operator= which always assumes the this pointer addresses an already properly constructed object. Inside that object you can expect there to be a pointer which will either be NULL or refer to some internal short-string buffer, or refer to heap. If your object hasn't been destructed, then operator= may well call a memory deallocation function on the bogus pointer. Placement new says "ignore the current content of the memory that this object will occupy, and construct a new object with valid members from scratch.

like image 164
Tony Delroy Avatar answered Nov 08 '22 17:11

Tony Delroy


The union does not declare a default constructor, copy constructor, copy assignment operator, or destructor.

If std::string declares at least one non-trivial version of a special member function (which is the case), the forementioned ones are all implicitly deleted, and you must declare (and define) them (... if they're used, which is the case).

Insofar, that code isn't correct and should not successfully compile (this is almost-to-the-letter identical to the example in 9.5 par 3 of the standard, except there it's std::string, not std::vector).
(Does not apply for an anon union, as correctly pointed out)

About question (2): In order to safely switch the union, this is necessary, yes. The Standard explicitly says that in 9.5 par 4 [Note].
It makes sense too, if you think about it. At most one data member can be active in a union at any time, and they're not magically default constructed/destroyed, which means you need to properly construct/destruct things. It is not meaningful (or even defined) to use the union as something else otherwise (not that you couldn't do that anyway, but it's undefined).
The object is not a pointer, and you don't know whether it's allocated on the heap either (even if it is allocated on the heap, then it's inside another object, so it's still not allowable to delete it). How do you destroy an object if you can't call delete? How do you allocate an object -- possibly several times -- without leaking if you can't delete it? This doesn't leave many choices. Insofar, the [Note] makes perfect sense.

like image 38
Damon Avatar answered Nov 08 '22 19:11

Damon