When would anyone use a union? Is it a remnant from the C-only days?

Use case 1: the chameleon

With unions, you can group a number of arbitrary classes under one name, which is somewhat similar to the case of a base class and its derived classes. What changes, however, is what you can and can't do with a given union instance:

struct Batman { /* ... */ };      // union members need complete types
struct BaseballBat { /* ... */ };

union Bat
{
    Batman brucewayne;
    BaseballBat club;
};

ReturnType1 f(void)
{
    BaseballBat bb = {/* */};
    Bat b;
    b.club = bb;
    // do something with b.club
}

ReturnType2 g(Bat& b)
{
    // do something with b, but how do we know what's inside?
}

Bat returnsBat(void);
ReturnType3 h(void)
{
    Bat b = returnsBat();
    // do something with b, but how do we know what's inside?
}

It appears that the programmer has to be certain of the type held by a given union instance before using it. That is the case in function f above. However, if a function receives a union instance as an argument, as g does, it has no way of knowing what to do with it. The same applies to functions returning a union instance, see h: how does the caller know what's inside?

If a union instance never gets passed as an argument or as a return value, then it's bound to have a very monotonous life, with spikes of excitement when the programmer chooses to change its content:

Batman bm = {/* */};
BaseballBat bb = {/* */};
Bat b;
b.brucewayne = bm;
// stuff
b.club = bb;

And that's the most (un)popular use case of unions. Another use case is when a union instance comes along with something that tells you its type.

Use case 2: "Nice to meet you, I'm `object`, from `Class`"

Suppose a programmer elected to always pair up a union instance with a type descriptor (I'll leave it to the reader's discretion to imagine an implementation for one such object). This defeats the purpose of the union if the goal is to save memory and the size of the type descriptor is not negligible compared to that of the union. But let's suppose it's crucial that the union instance can be passed as an argument or returned, with the callee or caller not knowing what's inside.

Then the programmer has to write a switch statement, or something equivalent, to tell Bruce Wayne apart from a wooden stick. It's not too bad when the union has only two possible contents, but it clearly doesn't scale: every new member means another case in every such switch.
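A minimal sketch of the pairing described above, assuming a plain enum is enough as the type descriptor. The names `TaggedBat`, `BatKind`, `makeHero`, `makeClub` and `describe`, as well as the members given to `Batman` and `BaseballBat`, are hypothetical and only there for illustration.

```cpp
#include <string>

// Hypothetical payload types, kept trivial so they can live in a plain union.
struct Batman      { int gadgets; };
struct BaseballBat { double lengthCm; };

// The type descriptor: records which union member is currently active.
enum class BatKind { Hero, Club };

struct TaggedBat
{
    BatKind kind;
    union
    {
        Batman brucewayne;
        BaseballBat club;
    };
};

TaggedBat makeHero(int gadgets)
{
    TaggedBat b;
    b.kind = BatKind::Hero;
    b.brucewayne = Batman{gadgets};
    return b;
}

TaggedBat makeClub(double lengthCm)
{
    TaggedBat b;
    b.kind = BatKind::Club;
    b.club = BaseballBat{lengthCm};
    return b;
}

// A callee that does not know in advance what is inside switches on the tag.
std::string describe(const TaggedBat& b)
{
    switch (b.kind)
    {
    case BatKind::Hero:
        return "Batman with " + std::to_string(b.brucewayne.gadgets) + " gadgets";
    case BatKind::Club:
        return "baseball bat";
    }
    return "unknown";
}
```

Calling `describe(makeHero(3))` goes through the `Hero` branch and `describe(makeClub(86.0))` through the `Club` branch. Since C++17, `std::variant` packages this tag-plus-storage pattern in the standard library.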

Use case 3:

As the authors of a recommendation for the ISO C++ Standard put it back in 2008,

Many important problem domains require either large numbers of objects or limited memory resources. In these situations conserving space is very important, and a union is often a perfect way to do that. In fact, a common use case is the situation where a union never changes its active member during its lifetime. It can be constructed, copied, and destructed as if it were a struct containing only one member. A typical application of this would be to create a heterogeneous collection of unrelated types which are not dynamically allocated (perhaps they are in-place constructed in a map, or members of an array).

And now, an example, with a UML class diagram:

[Figure: UML class diagram showing many compositions for class A]

The situation in plain English: an object of class A can have objects of any class among B1, ..., Bn, and at most one of each type, with n being a pretty big number, say at least 10.

We don't want to add fields (data members) to A like so:

private:
    B1 b1;
    .
    .
    .
    Bn bn;

because n might vary (we might want to add Bx classes to the mix), because this would cause a mess with constructors, and because A objects would take up a lot of space.

We could use a wacky container of void* pointers to Bx objects with casts to retrieve them, but that's fugly and so C-style... but more importantly that would leave us with the lifetimes of many dynamically allocated objects to manage.

Instead, what can be done is this:

union Bee
{
    B1 b1;
    .
    .
    .
    Bn bn;
};

enum BeesTypes { TYPE_B1, ..., TYPE_BN };

class A
{
private:
    std::unordered_map<int, Bee> data; // C++11, otherwise use std::map

public:
    Bee get(int); // the implementation is obvious: get from the unordered map
};

Then, to get the content of a union instance from data, you use a.get(TYPE_B2).b2 and the like, where a is an instance of class A.
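To make the idea concrete, here is a cut-down sketch with n reduced to 2. The contents of `B1` and `B2` and the helpers `setB1`, `setB2` and `has` are assumptions added for illustration; only `Bee`, `BeesTypes`, `data` and `get` come from the outline above.

```cpp
#include <unordered_map>

// Hypothetical, trivially copyable stand-ins for the Bx classes.
struct B1 { int value; };
struct B2 { double ratio; };

union Bee
{
    B1 b1;
    B2 b2;
};

enum BeesTypes { TYPE_B1, TYPE_B2 };

class A
{
private:
    // At most one entry per type: the enum value doubles as key and as tag.
    std::unordered_map<int, Bee> data;

public:
    void setB1(B1 b) { Bee bee; bee.b1 = b; data[TYPE_B1] = bee; }
    void setB2(B2 b) { Bee bee; bee.b2 = b; data[TYPE_B2] = bee; }

    bool has(int type) const { return data.count(type) != 0; }

    // Throws std::out_of_range if no object of that type is stored.
    Bee get(int type) const { return data.at(type); }
};
```

The key tells the caller which member to read, so `a.get(TYPE_B2).b2` is only meaningful because the entry was stored under `TYPE_B2` in the first place.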

This is all the more powerful since unions are unrestricted in C++11. See the document linked to above or this article for details.

One example is in the embedded realm, where each bit of a register may mean something different. For example, a union of an 8-bit integer and a structure with 8 separate 1-bit bitfields allows you to either change one bit or the entire byte.
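A hedged sketch of such a register overlay follows. `StatusRegister` and `withBit3Set` are made-up names, and no real hardware register is modeled. Two caveats apply: strictly speaking, C++ only guarantees reading the union member that was last written, so this relies on the compiler supporting union-based type punning (GCC documents that it does), and the order in which bit-fields are laid out within the byte is implementation-defined.

```cpp
#include <cstdint>

union StatusRegister
{
    std::uint8_t byte;          // the whole register at once
    struct
    {
        std::uint8_t b0 : 1;
        std::uint8_t b1 : 1;
        std::uint8_t b2 : 1;
        std::uint8_t b3 : 1;
        std::uint8_t b4 : 1;
        std::uint8_t b5 : 1;
        std::uint8_t b6 : 1;
        std::uint8_t b7 : 1;
    } bits;                     // the same storage, one bit at a time
};

// Load a full byte, then set a single bit through the bit-field view.
// Which physical bit b3 maps to depends on the compiler and target.
std::uint8_t withBit3Set(std::uint8_t value)
{
    StatusRegister r;
    r.byte = value;
    r.bits.b3 = 1;
    return r.byte;
}
```

Because of the layout caveat, code that must match a documented hardware bit position usually checks the result against the compiler's ABI or falls back to explicit masks.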

Herb Sutter wrote in GOTW about six years ago, with emphasis added:

"But don't think that unions are only a holdover from earlier times. Unions are perhaps most useful for saving space by allowing data to overlap, and this is still desirable in C++ and in today's modern world. For example, some of the most advanced C++ standard library implementations in the world now use just this technique for implementing the "small string optimization," a great optimization alternative that reuses the storage inside a string object itself: for large strings, space inside the string object stores the usual pointer to the dynamically allocated buffer and housekeeping information like the size of the buffer; for small strings, the same space is instead reused to store the string contents directly and completely avoid any dynamic memory allocation. For more about the small string optimization (and other string optimizations and pessimizations in considerable depth), see... ."
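As a rough illustration of the technique described in the quote, and not of how any particular standard library implements it, here is a toy string that stores short contents in place and long contents on the heap. `SmallString`, `kSmallCapacity` and the 16-byte threshold are assumptions chosen for the sketch; copying is disabled to keep it short.

```cpp
#include <cstddef>
#include <cstring>

class SmallString
{
public:
    explicit SmallString(const char* text)
        : size_(std::strlen(text))
    {
        if (isSmall())
        {
            // Short string: the characters live inside the object itself.
            std::memcpy(storage_.small, text, size_ + 1);
        }
        else
        {
            // Long string: the same bytes hold a pointer to a heap buffer.
            storage_.heap = new char[size_ + 1];
            std::memcpy(storage_.heap, text, size_ + 1);
        }
    }

    ~SmallString()
    {
        if (!isSmall())
            delete[] storage_.heap;
    }

    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;

    const char* c_str() const { return isSmall() ? storage_.small : storage_.heap; }
    std::size_t size() const { return size_; }

    // The size acts as the discriminant telling which union member is in use.
    bool isSmall() const { return size_ < kSmallCapacity; }

private:
    static const std::size_t kSmallCapacity = 16;

    std::size_t size_;
    union
    {
        char small[kSmallCapacity];
        char* heap;
    } storage_;
};
```

A string such as `"hi"` never touches the allocator here, while anything of 16 characters or more does.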

And for a less useful example, see the long but inconclusive question gcc, strict-aliasing, and casting through a union.

Well, one example use case I can think of is this:

typedef union
{
    struct
    {
        uint8_t a;
        uint8_t b;
        uint8_t c;
        uint8_t d;
    };
    uint32_t x;
} some32bittype;

You can then access the 8-bit separate parts of that 32-bit block of data; however, prepare to potentially be bitten by endianness.

This is just one hypothetical example, but whenever you want to split data in a field into component parts like this, you could use a union.

That said, there is also a method which is endian-safe:

uint32_t x;
uint8_t a = (x & 0xFF000000) >> 24;

This works because shifts and masks operate on the numeric value rather than on its byte layout in memory, so a is always the most significant byte, whatever the endianness.
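The shift-and-mask approach generalizes to all four bytes. In this small sketch, `byteAt` is a hypothetical helper name, and index 0 is taken to mean the most significant byte.

```cpp
#include <cstdint>

// Extract one byte of a 32-bit value by position, counting from the
// most significant byte (index 0) to the least significant (index 3).
// The result does not depend on how the host stores x in memory.
// The caller is expected to pass an index in the range 0..3.
std::uint8_t byteAt(std::uint32_t x, unsigned index)
{
    return static_cast<std::uint8_t>((x >> (8 * (3 - index))) & 0xFFu);
}
```

For `0x12345678`, index 0 yields `0x12` and index 3 yields `0x78` on any host, whereas the union members `a` to `d` would map to different bytes on little-endian and big-endian machines.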

Some uses for unions:

Provide a general endianness interface to an unknown external host.
Manipulate foreign CPU architecture floating point data, such as accepting VAX G_FLOATS from a network link and converting them to IEEE 754 long reals for processing.
Provide straightforward bit twiddling access to a higher-level type.

union {
    unsigned char   byte_v[sizeof(long double)];  // 16 on many, but not all, platforms
    long double     ld_v;
};

With this declaration, it is simple to display the hex byte values of a long double, change the exponent's sign, determine if it is a denormal value, or implement long double arithmetic for a CPU which does not support it, etc.

Saving storage space when fields are dependent on certain values:

class person {
    string name;

    char gender;   // M = male, F = female, O = other
    union {
        date  vasectomized;  // for males
        int   pregnancies;   // for females
    } gender_specific_data;
};

Grep the include files for use with your compiler. You'll find dozens to hundreds of uses of union:

[wally@zenetfedora ~]$ cd /usr/include
[wally@zenetfedora include]$ grep -w union *
a.out.h:  union
argp.h:   parsing options, getopt is called with the union of all the argp
bfd.h:  union
bfd.h:  union
bfd.h:union internal_auxent;
bfd.h:  (bfd *, struct bfd_symbol *, int, union internal_auxent *);
bfd.h:  union {
bfd.h:  /* The value of the symbol.  This really should be a union of a
bfd.h:  union
bfd.h:  union
bfdlink.h:  /* A union of information depending upon the type.  */
bfdlink.h:  union
bfdlink.h:       this field.  This field is present in all of the union element
bfdlink.h:       the union; this structure is a major space user in the
bfdlink.h:  union
bfdlink.h:  union
curses.h:    union
db_cxx.h:// 4201: nameless struct/union
elf.h:  union
elf.h:  union
elf.h:  union
elf.h:  union
elf.h:typedef union
_G_config.h:typedef union
gcrypt.h:  union
gcrypt.h:    union
gcrypt.h:    union
gmp-i386.h:  union {
ieee754.h:union ieee754_float
ieee754.h:union ieee754_double
ieee754.h:union ieee854_long_double
ifaddrs.h:  union
jpeglib.h:  union {
ldap.h: union mod_vals_u {
ncurses.h:    union
newt.h:    union {
obstack.h:  union
pi-file.h:  union {
resolv.h:   union {
signal.h:extern int sigqueue (__pid_t __pid, int __sig, __const union sigval __val)
stdlib.h:/* Lots of hair to allow traditional BSD use of `union wait'
stdlib.h:  (__extension__ (((union { __typeof(status) __in; int __i; }) \
stdlib.h:/* This is the type of the argument to `wait'.  The funky union
stdlib.h:   causes redeclarations with either `int *' or `union wait *' to be
stdlib.h:typedef union
stdlib.h:    union wait *__uptr;
stdlib.h:  } __WAIT_STATUS __attribute__ ((__transparent_union__));
thread_db.h:  union
thread_db.h:  union
tiffio.h:   union {
wchar.h:  union
xf86drm.h:typedef union _drmVBlank {

Unions are useful when dealing with byte-level (low level) data.

One of my recent uses was modeling an IP address, which looks like this:

// Composite structure for IP address storage
union
{
    // IPv4 @ 32-bit identifier
    // Padded 12-bytes for IPv6 compatibility
    union
    {
        struct
        {
            unsigned char _reserved[12];
            unsigned char _IpBytes[4];
        } _Raw;

        struct
        {
            unsigned char _reserved[12];
            unsigned char _o1;
            unsigned char _o2;
            unsigned char _o3;
            unsigned char _o4;    
        } _Octet;    
    } _IPv4;

    // IPv6 @ 128-bit identifier
    // Next generation internet addressing
    union
    {
        struct
        {
            unsigned char _IpBytes[16];
        } _Raw;

        struct
        {
            unsigned short _w1;
            unsigned short _w2;
            unsigned short _w3;
            unsigned short _w4;
            unsigned short _w5;
            unsigned short _w6;
            unsigned short _w7;
            unsigned short _w8;   
        } _Word;
    } _IPv6;
} _IP;
