
Is it legal to use address of one field of a union to access another field?

Tags:

c++

Consider the following code:

union U
{
    int a;
    float b;
};

int main()
{
    U u;
    int *p = &u.a;
    *(float *)p = 1.0f; // <-- this line
}

We all know that the addresses of union fields are usually the same, but I'm not sure whether doing something like this is well-defined behavior.

So the question is: is it legal and well-defined behavior to cast and dereference a pointer to a union field, as in the code above?


P.S. I know that it's more C than C++, but I'm trying to understand if it's legal in C++, not C.

HolyBlackCat asked Oct 10 '15 16:10



2 Answers

The standard guarantees that all members of a union reside at the same address. What you are doing is indeed well-defined behavior, but note that you cannot use the same approach to read from an inactive member of the union.

  • Accessing inactive union member - undefined behavior?

Note: Do not use C-style casts; prefer reinterpret_cast in this case.


As long as all you do is write to the other data member of the union, the behavior is well-defined. As stated, however, this changes which member is considered the active member of the union, meaning that afterwards you can only read from the member you just wrote to.

union U {
    int a;
    float b;
};

int main () {
    U u;
    int *p = &u.a;
    *reinterpret_cast<float*> (p) = 1.0f; // ok, well-defined
}
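
For comparison, here is a minimal sketch of the same read-after-write rule without any pointer cast. Assigning to a member through the union itself makes it the active member, and only that member is read back afterwards. The helper name write_then_read is hypothetical and not part of the original answer.

```cpp
#include <cassert>

union U {
    int a;
    float b;
};

// Hypothetical helper: makes `b` the active member, then reads it back.
// Reading `u.a` after the assignment would be a read of an inactive member.
float write_then_read(U& u, float value) {
    u.b = value;   // b becomes the active member
    return u.b;    // reading the member that was just written is fine
}
```

Calling write_then_read(u, 1.0f) on a union whose active member was previously a leaves b active.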

Note: There is an exception to the above rule when it comes to layout-compatible types.


The question can be rephrased as the following snippet, which is semantically equivalent to a boiled-down version of the "problem".

#include <type_traits>
#include <algorithm>
#include <cassert>

int main () {
  using union_storage_t = std::aligned_storage<
    std::max ( sizeof(int),   sizeof(float)),
    std::max (alignof(int),  alignof(float))
  >::type;

  union_storage_t u;

  int   * p1 = reinterpret_cast<  int*> (&u);
  float * p2 = reinterpret_cast<float*> (p1);
  float * p3 = reinterpret_cast<float*> (&u);

  assert (p2 == p3); // will never fire
}

What does the Standard (n3797) say?

9.5/1    Unions    [class.union]

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] The size of a union is sufficient to contain the largest of its non-static data members. Each non-static data member is allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.
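
The address and size guarantees in the quoted paragraph can be checked directly. This is only a minimal sketch; the function name members_share_address is an assumption made for illustration. Taking the address of a member does not read it, so it does not matter which member is active.

```cpp
#include <cassert>

union U {
    int a;
    float b;
};

// The union must be large enough to hold its largest member.
static_assert(sizeof(U) >= sizeof(int) && sizeof(U) >= sizeof(float),
              "union is at least as large as each member");

// Hypothetical helper: true when both members and the union object
// itself start at the same address.
bool members_share_address(U& u) {
    const void* pa = &u.a;
    const void* pb = &u.b;
    const void* pu = &u;
    return pa == pb && pa == pu;
}
```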

Note: The wording in C++11 (n3337) was underspecified, even though the intent has always been that of C++14.

Filip Roséen - refp answered Oct 16 '22 03:10


Yes, it is legal. Using explicit casts, you can do almost anything.

As other comments have stated, all members of a union start at the same address, so casting a pointer to one member into a pointer to another is pointless.

The generated assembly will be the same. Code should be easy to read, so I don't recommend the practice: it is confusing and offers no benefit.

Also, I recommend adding a "type" field so that you know whether the data is currently stored as a float or as an int.
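
One way such a "type" field might look is sketched below. All names here (Tagged, Kind, set_int, set_float, as_double) are hypothetical and chosen only for illustration; the answer does not prescribe a particular layout.

```cpp
#include <cassert>

// Hypothetical tagged union: `kind` records which member is active.
struct Tagged {
    enum class Kind { Int, Float };
    Kind kind;
    union {          // anonymous union: members are accessed as t.a / t.b
        int a;
        float b;
    };
};

// Each setter writes the member and updates the tag together.
inline void set_int(Tagged& t, int v)     { t.a = v; t.kind = Tagged::Kind::Int; }
inline void set_float(Tagged& t, float v) { t.b = v; t.kind = Tagged::Kind::Float; }

// Reads only the member that the tag says is active.
inline double as_double(const Tagged& t) {
    return t.kind == Tagged::Kind::Int ? static_cast<double>(t.a)
                                       : static_cast<double>(t.b);
}
```

Keeping the tag update inside the setters means a reader of the value never has to guess which member was written last.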

Thomas Matthews answered Oct 16 '22 01:10