
Writing to Unions, with gcc

Tags: c++, gcc, g++, unions

Consider the following types:

struct A { int x; };
struct B { int y; char z; };
union U { A a; B b; };

And this code fragment:

U u;
new (&u.b) B;
u.b.y = 42;
u.b.z = 'x';

At this point, reading from u.a.x is well-defined behavior (and will yield 42) because it's in the common initial sequence. Writing to u.a.x is undefined behavior.
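As a minimal sketch of the well-defined read described above, using the same types: the function name is made up for illustration, and plain assignment is used instead of placement new to make u.b active.

```cpp
#include <cassert>

struct A { int x; };
struct B { int y; char z; };
union U { A a; B b; };

// Hypothetical helper: makes u.b the active member, then reads the int
// through u.a. A and B are standard-layout and both begin with an int,
// so A::x and B::y form a common initial sequence.
int read_through_inactive_member() {
    U u;
    u.b = B{42, 'x'};  // trivial assignment makes u.b the active member
    return u.a.x;      // inspects the common initial sequence of A and B
}
```

Only the leading int is covered by this rule; B::z has no counterpart in A.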

But gcc allows type-punning through a union - the docs explicitly allow reading the inactive member, regardless of member sequencing. Does gcc allow writing to an inactive member, even the common initial sequence, or would this still be undefined behavior on gcc? That is, if at this point, I had:

void write(A& arg) { arg.x = 17; }
write(u.a);  // generally undefined behavior, since active member is u.b
             // but does gcc allow it?
f(u.b.y);    // is this definitely f(17)?
g(u.b.z);    // ... and is this definitely g('x')?

Barry asked May 08 '17 15:05


1 Answer

The behavior you are describing (called “type-punning”) is accepted by compilers but is undefined behavior according to the C++ specification (see the note below). It may still be defined for a specific compiler and hardware. More specifically, on gcc and x86 with simple types (char, short, float, double, ...) it acts like a reinterpret_cast between the different members.

...The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. (source: the gcc documentation for -fstrict-aliasing)

Type-punning is also useful in practice, for example when reading raw data from a device (e.g. a socket):

union {
    struct {
        int a;
        char b;
        short c;
    } data;

    char buf[128];
} u;

read_from_device(u.buf, 128);
printf("Data (a,b,c): (%d,%d,%d)\n", u.data.a, u.data.b, u.data.c);

First, we read the raw data from the device, then we use the struct to reinterpret it as numbers. We often use #pragma pack on a struct to ensure the data is packed the same way it is packed on the device.
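For comparison, here is a rough sketch of the same reinterpretation done with std::memcpy, which does not depend on the gcc union extension. The names Data, fake_read_from_device, decode and read_data are assumptions made up for this example; the fake reader just stands in for a real device so the code can run on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same layout as the anonymous struct in the union above.
struct Data {
    int a;
    char b;
    short c;
};

// Hypothetical stand-in for read_from_device: fills buf with the bytes
// of a known Data value instead of talking to real hardware.
void fake_read_from_device(char* buf, std::size_t len) {
    Data sample{};
    sample.a = 7;
    sample.b = 'x';
    sample.c = -3;
    std::memset(buf, 0, len);
    std::memcpy(buf, &sample, sizeof sample);
}

// Copies the leading bytes of buf into a Data object.
Data decode(const char* buf) {
    Data out;
    std::memcpy(&out, buf, sizeof out);
    return out;
}

// Reads the raw bytes first, then reinterprets them as a Data value.
Data read_data() {
    char buf[128];
    fake_read_from_device(buf, sizeof buf);
    return decode(buf);
}
```

With a real device, the struct layout (padding, byte order) would still have to match what the device sends, which is where #pragma pack comes in.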

Note on Active Member

The active member is tracked implicitly, by the programmer rather than the compiler. It is up to you to know which member is active. The lifetime of a member starts when you assign to it.

...the beginning of its lifetime is sequenced after the value computation of the left and right operands and before the assignment.

The fact that the manual specifies that writing to a non-active member makes it active suggests that this is allowed.

See this for more information:

Member lifetime

The lifetime of a union member begins when the member is made active. If another member was active previously, its lifetime ends.

When active member of a union is switched by an assignment expression of the form E1 = E2 that uses either the built-in assignment operator or a trivial assignment operator, for each union member X that appears in the member access and array subscript subexpressions of E1 that is not a class with non-trivial or deleted default constructors, if modification of X would have undefined behavior under type aliasing rules, an object of the type of X is implicitly created in the nominated storage; no initialization is performed and the beginning of its lifetime is sequenced after the value computation of the left and right operands and before the assignment.

union A { int x; int y[4]; };
struct B { A a; };
union C { B b; int k; };
int f() {
  C c;               // does not start lifetime of any union member
  c.b.a.y[3] = 4;    // OK: "c.b.a.y[3]", names union members c.b and c.b.a.y;
                     // This creates objects to hold union members c.b and c.b.a.y
  return c.b.a.y[3]; // OK: c.b.a.y refers to newly created object
}

struct X { const int a; int b; };
union Y { X x; int k; };
void g() {
  Y y = { { 1, 2 } }; // OK, y.x is active union member (9.2)
  int n = y.x.a;
  y.k = 4;   // OK: ends lifetime of y.x, y.k is active member of union
  y.x.b = n; // undefined behavior: y.x.b modified outside its lifetime,
             // "y.x.b" names y.x, but X's default constructor is deleted,
             // so union member y.x's lifetime does not implicitly start
}
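
Applied to the types from the question, the assignment rule quoted above can be sketched as follows. This is only an illustration under that rule: the function name is made up, and it assigns to u.a.x directly rather than through a reference parameter as the question's write() does, so it does not settle that case.

```cpp
#include <cassert>

struct A { int x; };
struct B { int y; char z; };
union U { A a; B b; };

// Hypothetical helper: switches the active member from u.b to u.a
// with a built-in assignment, then reads the shared leading int.
int switch_active_member() {
    U u;
    u.b = B{42, 'x'};  // u.b is the active member
    u.a.x = 17;        // "u.a.x" names union member u.a, so u.a becomes active
    return u.b.y;      // read via the common initial sequence of A and B
}
```

Note that u.b.z is outside the common initial sequence, so this sketch says nothing about whether reading it afterwards is defined.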

Liran Funaro answered Nov 14 '22 19:11