I was working with some code (which I thought was bad) that had a union like:
union my_msg_union
{
    struct message5 msg;
    char buffer[256];
} message;
The buffer was filled with 256 bytes from comms. The struct is something like:
struct message5 {
    uint8 id;
    uint16 size;
    uint32 data;
    uint8 num_ids;
    uint16 ids[4];
} message5d;
The same code was being compiled on heaps of architectures (8-bit AVR, 16-bit Philips, 32-bit ARM, 32-bit x86 and amd64).
The problem, I thought, was the use of the union: the code just copies a blob of serially received bytes into the buffer, then reads the values out through the struct, without considering the struct's alignment or padding.
Sure enough, a quick look at sizeof(message5d) on different systems gave different results.
What surprised me, however, is that whenever the union with the char[] existed, all instances of all structs of that type, on all systems, dropped their padding/alignment and were laid out as sequential bytes.
Is this required by the C standard, or just something that compiler authors have put in to 'help'?
This code demonstrates the opposite behaviour from the one you describe:
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct message5
{
    uint8_t  id;
    uint16_t size;
    uint32_t data;
    uint8_t  num_ids;
    uint16_t ids[4];
};

#if !defined(NO_UNION)
union my_msg_union
{
    struct message5 msg;
    char buffer[256];
};
#endif /* NO_UNION */

struct data
{
    char const *name;
    size_t      offset;
};

int main(void)
{
    struct data offsets[] =
    {
        { "message5.id",              offsetof(struct message5, id)              },
        { "message5.size",            offsetof(struct message5, size)            },
        { "message5.data",            offsetof(struct message5, data)            },
        { "message5.num_ids",         offsetof(struct message5, num_ids)         },
        { "message5.ids",             offsetof(struct message5, ids)             },
#if !defined(NO_UNION)
        { "my_msg_union.msg.id",      offsetof(union my_msg_union, msg.id)      },
        { "my_msg_union.msg.size",    offsetof(union my_msg_union, msg.size)    },
        { "my_msg_union.msg.data",    offsetof(union my_msg_union, msg.data)    },
        { "my_msg_union.msg.num_ids", offsetof(union my_msg_union, msg.num_ids) },
        { "my_msg_union.msg.ids",     offsetof(union my_msg_union, msg.ids)     },
#endif /* NO_UNION */
    };
    enum { NUM_OFFSETS = sizeof(offsets) / sizeof(offsets[0]) };

    for (size_t i = 0; i < NUM_OFFSETS; i++)
        printf("%-25s %3zu\n", offsets[i].name, offsets[i].offset);
    return 0;
}
Sample output (GCC 4.8.2 on Mac OS X 10.9 Mavericks, 64-bit compilation):
message5.id                 0
message5.size               2
message5.data               4
message5.num_ids            8
message5.ids               10
my_msg_union.msg.id         0
my_msg_union.msg.size       2
my_msg_union.msg.data       4
my_msg_union.msg.num_ids    8
my_msg_union.msg.ids       10
The offsets within the union are the same as the offsets within the structure, as the C standard requires.
To show otherwise, you would have to give a complete, compilable counter-example based on the code above, and specify which compiler and platform produce your deviant answer, if indeed you can reproduce it.
I note that I had to change uint8 etc. to uint8_t, but I don't think that makes any difference. If it does, you need to specify which header you get the names like uint8 from.
Code updated to be compilable with or without the union. Output when compiled with -DNO_UNION:
message5.id                 0
message5.size               2
message5.data               4
message5.num_ids            8
message5.ids               10