
Union correct usage

Tags: c++, c++11

My understanding is that all members of a union share the same memory address, and that the union is as large as its largest member. But I don't understand how we would actually use one. Here is some code for which, according to The C++ Programming Language, a union is preferable.

enum Type { str, num };

struct Entry {
     char* name;
     Type t;
     char* s;  // use s if t==str
     int i;    // use i if t==num
};

void f(Entry* p)
{
     if (p->t == str)
           cout << p->s;
     // ...
}

After this Bjarne says:

The members s and i can never be used at the same time, so space is wasted. It can be easily recovered by specifying that both should be members of a union, like this:

union Value {
     char* s;
     int i;
};

The language doesn’t keep track of which kind of value is held by a union, so the programmer must do that:

struct Entry {
     char* name;
     Type t;
     Value v;  // use v.s if t==str; use v.i if t==num
};

void f(Entry* p)
{
     if (p->t == str)
           cout << p->v.s;
     // ...
}

Can anyone explain the resulting union code further? What will actually happen if we transform this into a union?

lightning_missile asked Feb 11 '23 08:02


1 Answer

Let's say you have a 32-bit machine, with 32-bit integers and pointers. Your struct might then look like this:

[0-3] name
[4-7] type
[8-11] string
[12-15] integer

That's 16 bytes, but since type (t in your code) determines which field is valid, we never need to actually store the string and integer fields at the same time. So we can change the code:

struct Entry {
  char* name;
  Type t;
  union {
    char* s;  // use s if t==str
    int i;    // use i if t==num
  } u;
};

Now the layout is:

[0-3] name
[4-7] type
[8-11] string
[8-11] integer
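
If you want to check the two diagrams on your own machine, a minimal sketch along these lines may help. The names FlatEntry, UnionEntry, members_overlap and print_layouts are made up for this illustration and are not from the book. The offsets it prints depend on your platform, so they will only match the diagrams above on a 32-bit layout like the one assumed here.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <iostream>

enum Type { str, num };

// Hypothetical name: the original struct, with s and i stored side by side.
struct FlatEntry {
    char* name;
    Type t;
    char* s;  // use s if t==str
    int i;    // use i if t==num
};

// Hypothetical name: the same struct with s and i folded into a union.
struct UnionEntry {
    char* name;
    Type t;
    union {
        char* s;  // use s if t==str
        int i;    // use i if t==num
    } u;
};

// Returns true if both union members start at the same address.
bool members_overlap(const UnionEntry& e)
{
    return static_cast<const void*>(&e.u.s) == static_cast<const void*>(&e.u.i);
}

// Prints the byte offset of the fields that differ between the two layouts.
void print_layouts()
{
    UnionEntry e{};
    assert(members_overlap(e));  // every member of a union starts where the union starts

    std::cout << "flat:  s at offset " << offsetof(FlatEntry, s)
              << ", i at offset " << offsetof(FlatEntry, i) << '\n';
    std::cout << "union: s and i both at offset " << offsetof(UnionEntry, u) << '\n';
}
```

Calling print_layouts() from main should show two different offsets for the flat struct and a single shared offset for the union version.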

In C++, the member you assigned to most recently is the "active" member of the union, and reading a different member is, in general, undefined behavior. The union itself does not record which member is active, so you must store that yourself. This technique is often called a "discriminated union", the "discriminator" being the type field.
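
To make the bookkeeping concrete, here is a small sketch of one way to keep the discriminator and the union in sync. The helper functions (set_string, set_number, get_string, get_number, print) are hypothetical names invented for this example, and I have changed char* to const char* so that string literals can be stored, which is an assumption that differs from the code in the question.

```cpp
#include <cassert>
#include <iostream>

enum Type { str, num };

struct Entry {
    const char* name;
    Type t;               // discriminator: records which member of u is valid
    union {
        const char* s;    // valid if t==str
        int i;            // valid if t==num
    } u;
};

// Setters write the union member and the discriminator together,
// so the two can never disagree.
void set_string(Entry& e, const char* value) { e.t = str; e.u.s = value; }
void set_number(Entry& e, int value)         { e.t = num; e.u.i = value; }

// Getters check the discriminator before touching the union.
const char* get_string(const Entry& e) { assert(e.t == str); return e.u.s; }
int         get_number(const Entry& e) { assert(e.t == num); return e.u.i; }

// Prints the entry by branching on the discriminator, like f() in the question.
void print(const Entry& e)
{
    std::cout << e.name << " = ";
    if (e.t == str)
        std::cout << e.u.s;
    else
        std::cout << e.u.i;
    std::cout << '\n';
}
```

The point of routing every access through small functions like these is that the "use s if t==str" rule stops being a comment and becomes something the code checks, at least in builds where assert is enabled.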

So, with the 32-bit layout above, the second struct takes 12 bytes instead of 16. If you're storing lots of them, or if they come from a network or disk, you might care about this. Otherwise, it's not really important.
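
For a rough idea of what the saving adds up to, the sketch below compares sizeof for the two versions and multiplies the difference by an entry count. FlatEntry, UnionEntry and bytes_saved are hypothetical names for this illustration only, and the actual sizes depend on your compiler and platform.

```cpp
#include <cstddef>   // std::size_t

enum Type { str, num };

// Hypothetical name: the original struct from the question.
struct FlatEntry {
    char* name;
    Type t;
    char* s;
    int i;
};

// Hypothetical name: the version with s and i sharing storage.
struct UnionEntry {
    char* name;
    Type t;
    union {
        char* s;
        int i;
    } u;
};

// Assumption checked at compile time: the union version is never larger.
static_assert(sizeof(UnionEntry) <= sizeof(FlatEntry),
              "expected the union version to be no larger than the flat one");

// Total bytes saved when storing `count` entries as UnionEntry instead of FlatEntry.
constexpr std::size_t bytes_saved(std::size_t count)
{
    return count * (sizeof(FlatEntry) - sizeof(UnionEntry));
}
```

On a typical 64-bit platform the pointers are 8 bytes and padding is added after the 4-byte fields, so the two sizes are usually larger than 16 and 12, but the union version should still come out smaller.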

John Zwinck answered Feb 13 '23 22:02