Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ Union Member Access And Undefined Behaviour

I am currently working on a project in which I am provided the following structre. My work is C++ but the project uses both C and C++. The same structure definition is used by both C and C++.

typedef struct PacketHeader {
    //Byte 0
    uint8_t  bRes                           :4;
    uint8_t  bEmpty                         :1;
    uint8_t  bWait                          :1;
    uint8_t  bErr                           :1;
    uint8_t  bEnable                        :1;
    //Byte 1
    uint8_t  bInst                          :4;
    uint8_t  bCount                         :3;
    uint8_t  bRres                          :1;
    //Bytes 2, 3
    union {
        uint16_t wId;    /* Needed for Endian swapping */
        struct{
            uint16_t wMake                  :4;
            uint16_t wMod                   :12;
        };
    };
} PacketHeader;

Depending on how instances of the structure are used, the required endianness of the structure can be big or little endian. As the first two bytes of the structure are each single bytes, these don't need altering when the endianness changes. Bytes 2 and 3, stored as a single uint16_t, are the only bytes which we need to swap to acheive the desired endianness. To acheive the endianness swap, we have been performing the following:

//Returns a constructed instance of PacketHeader with relevant fields set and the provided counter value
PacketHeader myHeader = mmt::BuildPacketHeader(count);

uint16_t packetIdFlipped;
//Swap positions of byte 2 and 3
packetIdFlipped = myHeader.wId << 8;
packetIdFlipped |= (uint16_t)myHeader.wId >> 8;

myHeader.wId = packetIdFlipped;

The function BuildPacketHeader(uint8_t) assigns values to the members wMake and wMod explicitly, and does not write to the member wId. My question is regarding the safety of reading from the member wId inside the returned instance of the structure.

Questions such as Accessing inactive union member and undefined behavior?, Purpose of Unions in C and C++, and Section 10.4 of the draft standard I have each mention the undefined behaviour arising from accessing an inactive member of a union in C++.

Paragraph 1 in Section 10.4 of the linked draft also contains the following note, though I'm not sure I understand all the terminology used:

[Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (10.3), and if a non-static datamember of an object of this standard-layout union type is active and is one of the standard-layout structs, itis permitted to inspect the common initial sequence of any of the standard-layout struct members; see 10.3.— end note]

Is reading myHeader.wId in the line packetIdFlipped = myHeader.wId << 8 undefined behaviour?

Is the unnamed struct the active member as it was the last member written to in the function call?

Or does the note mean it is safe to access the wId member, as it and the struct share a common type? (and is this what is meant by common initial sequence?)

Thanks in advance

like image 583
cprlkleg Avatar asked Mar 21 '19 13:03

cprlkleg


People also ask

Why does C have so much undefined behavior?

It exists because of the syntax rules of C where a variable can be declared without init value. Some compilers assign 0 to such variables and some just assign a mem pointer to the variable and leave just like that. if program does not initialize these variables it leads to undefined behavior.

How to define union in c++?

In C++17 and later, the std::variant class is a type-safe alternative for a union. A union is a user-defined type in which all members share the same memory location. This definition means that at any given time, a union can contain no more than one object from its list of members.

Can a c++ union have a constructor?

A union can have member functions (including constructors and destructors), but not virtual functions. A union cannot have base classes and cannot be used as a base class. A union cannot have non-static data members of reference types.

What is Type punning C++?

Type Punning In a systems language like C++ you often want to interpret a value of type A as a value of type B where A and B are completely unrelated types. This is called type punning. Take for example the ever popular Fast Inverse Square Root.


1 Answers

The function BuildPacketHeader(uint8_t) assigns values to the members wMake and wMod explicitly, and does not write to the member wId. My question is regarding the safety of reading from the member wId inside the returned instance of the structure.

Yes, it's UB. It does not mean it's not working, just that it may not work. You can use memcpy inside BuildPacketHeader to avoid that (see this and this).

like image 143
L.C. Avatar answered Sep 19 '22 11:09

L.C.