I stumbled across some code based on unions in C. Here is the code:
union {
    struct {
        char ax[2];
        char ab[2];
    } s;
    struct {
        int a;
        int b;
    } st;
} u = {12, 1};
printf("%d %d", u.st.a, u.st.b);
I just couldn't understand why the output was 268 0. How were the values initialized? How is the union functioning here? Shouldn't the output be 12 1? It would be great if anyone could explain in detail what exactly is happening here.
I am using a 32-bit processor on Windows 7.
Structure members can be initialized using curly braces '{}'. For example, the following is a valid initialization.
A union can be initialized in its declaration. Because only one member can be used at a time, only one can be initialized. Without a designator, a brace initializer always applies to the first member of the union.
In C++, when we define a struct (or class) type, we can provide a default initialization value for each member as part of the type definition. This process is called non-static member initialization, and the initialization value is called a default member initializer.
A structure contains an ordered group of data objects. Unlike the elements of an array, the data objects within a structure can have varied data types. Each data object in a structure is a member or field. A union is an object similar to a structure except that all of its members start at the same location in memory.
The code doesn't do what you think. Brace initialization initializes the first union member, i.e. u.s. However, the initializer is then incomplete and missing braces, since u.s contains two arrays. It should be something like: u = { { {'a', 'b'}, { 'c', 'd' } } };
You should always compile with all warnings enabled; a decent compiler should have told you that something was amiss. For instance, GCC says "missing braces around initialiser (near initialisation for ‘u.s’)" and "missing initialiser (near initialisation for ‘u.s.ab’)". Very helpful.
In C99 you can take advantage of named member initialization (designated initializers) to initialize the second union member: u = { .st = {12, 1} }; (This was not possible in C++ before C++20, by the way.) The corresponding syntax for the first case is u = { .s = { {'a', 'b'}, { 'c', 'd' } } };, which is arguably more explicit and readable!
Your code initializes the union through its default member, which is the first one. Both 12 and 1 go into the characters of ax, hence the result that you see (which is very much compiler-dependent).
If you wanted to initialize through the second member (st), you would use a designated initializer:
union {
    struct {
        char ax[2];
        char ab[2];
    } s;
    struct {
        int a;
        int b;
    } st;
} u = { .st = {12, 1} };
The code sets u.s.ax[0] to 12 and u.s.ax[1] to 1. u.s.ax is overlaid onto u.st.a, so the least-significant byte of u.st.a is set to 12 and the next byte to 1 (so you must be running on a little-endian architecture), giving a value of 0x010C, or 268.
A union is at least as large as its largest member. So in this case, your union type has a size of 8 bytes on a 32-bit platform where int types are 4 bytes each. The first member of the union, s, though, only takes up 4 bytes, and its ax array overlaps with the first 2 bytes of the st.a member. Since you are on a little-endian system, that means that ax overlaps the two lower-order bytes of st.a. Thus, when you initialize the union as it's done with the values {12, 1}, you've only set the two lower-order bytes of st.a; the rest of s (the ab array) is zero-initialized, and the value of st.b ends up as 0. Thus when you attempt to print out the struct containing the two int rather than char members of the union, you end up with your results of 268 and 0.