Our team is currently using some ported code from an old architecture to a new product based on the ARM Cortex M3 platform using a customized version of GCC 4.5.1. We are reading data from a communications link, and attempting to cast the raw byte array to a struct to cleanly parse the data. After casting the pointer to a struct and dereferencing, we are getting a warning: "dereferencing type-punned pointer will break strict-aliasing rules".
After some research, I've realized that since the char array has no alignment rules and the struct have to be word aligned, casting the pointers causes undefined behavior (a Bad Thing). I'm wondering if there is a better way to do what we're trying.
I know we can explicitly word-align the char array using GCC's "attribute ((aligned (4)))". I believe this will make our code "safer", but the warnings will still clutter up our builds, and I don't want to disable the warnings in case this situation arises again. What we want is a way to safely do what we are trying, that will still inform us if we attempt to do something unsafe in another place later. Since this is an embedded system, RAM usage and flash usage are important to some degree.
Portability (compiler and architecture) is not a huge concern, this is just for one product. However, if a portable solution exists, it would be preferred.
Here is the a (very simplified) example of what we are currently doing:
#define MESSAGE_TYPE_A 0
#define MESSAGE_TYPE_B 1
typedef struct MessageA __attribute__((__packed__))
{
unsigned char messageType;
unsigned short data1;
unsigned int data2;
}
typedef struct MessageB __attribute__((__packed__))
{
unsigned char messageType;
unsigned char data3;
unsigned char data4;
}
// This gets filled by the comm system, assume from a UART interrupt or similar
unsigned char data[100];
// Assume this gets called once we receive a full message
void ProcessMessage()
{
MessageA* messageA;
unsigned char messageType = data[0];
if (messageType == MESSAGE_TYPE_A)
{
// Cast data to struct and attempt to read
messageA = (MessageA*)data; // Not safe since data may not be word aligned
// This may cause undefined behavior
if (messageA->data1 == 4) // warning would be here, when we use the data at the pointer
{
// Perform some action...
}
}
// ...
// process different types of messages
}
As has already been pointed out, casting pointers about is a dodgy practice.
Solution: use a union
struct message {
unsigned char messageType;
union {
struct {
int data1;
short data2;
} A;
struct {
char data1[5];
int data2;
} B;
} data;
};
void func (...) {
struct message msg;
getMessage (&msg);
switch (msg.messageType) {
case TYPEA:
doStuff (msg.data.A.data1);
break;
case TYPEB:
doOtherStuff (msg.data.B.data1);
break;
}
}
By this means the compiler knows you're accessing the same data via different means, and the warnings and Bad Things will go away.
Of coure, you'll need to make sure the structure alignment and packing matches your message format. Beware endian issues and such if the machine on the other end of the link doesn't match.
Type punning through cast of types different than char *
or a pointer to a signed/unsigned variant of char
is not strictly conforming as it violates C aliasing rules (and sometimes alignment rules if no care is given).
However, gcc
permits type punning through union types. Manpage of gcc
explicitly documents it:
The practice of reading from a different union member than the one most recently written to (called "type-punning") is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type.
To disable optimizations related to aliasing rules with gcc
(and thus allow the program to break C aliasing rules), the program can be compiled with: -fno-strict-aliasing
. Note that with this option enabled, the program is no longer strictly conforming, but you said portability is not a concern. For information, the Linux kernel is compiled with this option.
GCC has a -fno-strict-aliasing
flag that will disable strict-aliasing-based optimizations and make your code safe.
If you're really looking for a way to "fix" it, you have to rethink the way your code works. You can't just overlay the structure the way you're trying, so you need to do something like this:
MessageA messageA;
messageA.messageType = data[0];
// Watch out - endianness and `sizeof(short)` dependent!
messageA.data1 = (data[1] << 8) + data[2];
// Watch out - endianness and `sizeof(int)` dependent!
messageA.data2 = (data[3] << 24) + (data[4] << 16)
+ (data[5] << 8) + data[6];
This method will let you avoid packing your structure, which might also improve its performance characteristics elsewhere in your code. Alternately:
MessageA messageA;
memcpy(&messageA, data, sizeof messageA);
Will do it with your packed structures. You would do the reverse operations to translate the structures back into a flat buffer if necessary.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With