My program receives messages over the network. These messages are deserialized by some middleware (i.e. someone else's code which I cannot change). My program receives objects that look something like this:
struct Message { int msg_type; std::vector<uint8_t> payload; };
By examining msg_type I can determine that the message payload is actually, for example, an array of uint16_t values. I would like to read that array without an unnecessary copy.
My first thought was to do this:
const uint16_t* a = reinterpret_cast<uint16_t*>(msg.payload.data());
But then reading from a would appear to violate the standard. Here is clause 3.10.10:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type similar (as defined in 4.4) to the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
- an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
- a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
- a char or unsigned char type.
In this case, *a would be the glvalue, and its type, uint16_t, does not appear to meet any of the listed criteria.
So how do I treat the payload as an array of uint16_t values without invoking undefined behavior or performing an unnecessary copy?
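The strict aliasing rule lets the compiler assume that pointers to unrelated types do not alias. The exception is the character types such as char and unsigned char, through which the bytes of any object may be accessed. A void* can hold the address of any object, but it cannot be dereferenced without first being cast to another pointer type.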
In C, C++, and some other programming languages, the term aliasing refers to a situation where two different expressions or symbols refer to the same object.
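If you are going to consume the values one by one then you can memcpy each one into a uint16_t, or write payload[0] + 0x100 * payload[1] etc., depending on which byte order you want: memcpy gives you the host's native byte order, while the arithmetic form shown reads the bytes as little-endian regardless of the host. This will not be "inefficient"; compilers typically turn a small fixed-size memcpy into a plain load.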
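The two ways of reading a single value can be sketched as follows. This is only an illustration: the helper names read_u16_native and read_u16_le are made up for this example, and both assume the caller has already checked that the payload holds at least 2 * (i + 1) bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reads the i-th 16-bit value in the host's native
// byte order. memcpy copies bytes, so no uint16_t glvalue ever refers to
// the uint8_t storage and the aliasing rule is not violated.
// Assumes 2 * i + 1 < payload.size().
inline std::uint16_t read_u16_native(const std::vector<std::uint8_t>& payload,
                                     std::size_t i) {
    std::uint16_t value;
    std::memcpy(&value, payload.data() + 2 * i, sizeof value);
    return value;
}

// Hypothetical helper: reads the i-th 16-bit value as little-endian,
// independent of the host's byte order.
// Assumes 2 * i + 1 < payload.size().
inline std::uint16_t read_u16_le(const std::vector<std::uint8_t>& payload,
                                 std::size_t i) {
    return static_cast<std::uint16_t>(payload[2 * i] |
                                      (payload[2 * i + 1] << 8));
}
```

If the wire format has a defined byte order, the arithmetic version is probably the safer choice, since its result does not depend on the machine the program runs on.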
If you have to call a function that only takes an array of uint16_t, and you cannot change the struct that delivers Message, then you are out of luck. In Standard C++ you'll have to make the copy.
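A minimal sketch of that copy, assuming the payload stores the values in the host's native byte order. The function name to_u16_array is hypothetical, and a trailing odd byte, if any, is simply ignored here.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the payload bytes into a properly typed
// (and properly aligned) vector of uint16_t, in native byte order.
// A trailing odd byte is dropped.
inline std::vector<std::uint16_t> to_u16_array(
        const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint16_t> out(payload.size() / sizeof(std::uint16_t));
    if (!out.empty()) {
        std::memcpy(out.data(), payload.data(),
                    out.size() * sizeof(std::uint16_t));
    }
    return out;
}
```

The resulting vector's data() pointer can then be handed to the function that expects a uint16_t array.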
If you are using gcc or clang, another option is to set -fno-strict-aliasing while compiling the code in question.