I stumbled across a code based on unions in C. Here is the code:
union {
struct {
char ax[2];
char ab[2];
} s;
struct {
int a;
int b;
} st;
} u ={12, 1};
printf("%d %d", u.st.a, u.st.b);
I just couldn't understand how come the output was 268 0. How were the values initialized?
How is the union functioning here? Shouldn't the output be 12 1. It would be great if anyone could explain what exactly is happening here in detail.
I am using a 32 bit processor and on Windows 7.
Structure members can be initialized using curly braces '{}'. For example, following is a valid initialization.
A union can be initialized on its declaration. Because only one member can be used at a time, only one can be initialized. To avoid confusion, only the first member of the union can be initialized.
When we define a struct (or class) type, we can provide a default initialization value for each member as part of the type definition. This process is called non-static member initialization, and the initialization value is called a default member initializer.
A structure contains an ordered group of data objects. Unlike the elements of an array, the data objects within a structure can have varied data types. Each data object in a structure is a member or field. A union is an object similar to a structure except that all of its members start at the same location in memory.
The code doesn't do what you think. Brace-initializes initialize the first union member, i.e. u.s. However, now the initializer is incomplete and missing braces, since u.s contains two arrays. It should be somethink like: u = { { {'a', 'b'}, { 'c', 'd' } } };
You should always compile with all warnings, a decent compiler should have told you that something was amiss. For instance, GCC says, missing braces around initialiser (near initialisation for ‘u.s’) and missing initialiser (near initialisation for ‘u.s.ab’). Very helpful.
In C99 you can take advantage of named member initialization to initialize the second union member: u = { .st = {12, 1} }; (This is not possible in C++, by the way.) The corresponding syntax for the first case is `u = { .s = { {'a', 'b'}, { 'c', 'd' } } };, which is arguably more explicit and readable!
Your code uses the default initializer for the union, which is its first member. Both 12 and 1 go into the characters of ax, hence the result that you see (which is very much compiler-dependent).
If you wanted to initialize through the second memmber (st) you would use a designated initializer:
union {
struct {
char ax[2];
char ab[2];
} s;
struct {
int a;
int b;
} st;
} u ={ .st = {12, 1}};
The code sets u.s.ax[0] to 12 and u.s.ax[1] to 1. u.s.ax is overlayed onto u.st.a so the least-significant byte of u.st.a is set to 12 and the most-significant byte to 1 (so you must be running on a little-endian architecture) giving a value of 0x010C or 268.
A union's size is the maximum size of the largest element that composes the union. So in this case, your union type has a size of 8-bytes on a 32-bit platform where int types are 4-bytes each. The first member of the union, s, though, only takes up 2-bytes, and therefore overlaps with the first 2-bytes of the st.a member. Since you are on a little-endian system, that means that we're overlapping the two lower-order bytes of st.a. Thus, when you initialize the union as it's done with the values {12, 1}, you've only initialized the values in the two lower-order bytes of st.a ... this leaves the value of st.b initialized to 0. Thus when you attempt to print out the struct containing the two int rather than char members of the union, you end up with your results of 128 and 0.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With