C++ unions vs. reinterpret_cast

It appears from other StackOverflow questions and from reading §9.5.1 of the ISO/IEC draft C++ standard that using a union to do a literal reinterpret_cast of data is undefined behavior.

Consider the code below. The goal is to take the integer value of 0xffff and literally interpret it as a series of bits in IEEE 754 floating point. (Binary convert shows visually how this is done.)

#include <iostream>
using namespace std;

union unionType {
    int myInt;
    float myFloat;
};

int main() {

    int i = 0xffff;

    unionType u;
    u.myInt = i;

    cout << "size of int    " << sizeof(int) << endl;
    cout << "size of float  " << sizeof(float) << endl;

    cout << "myInt          " << u.myInt << endl;
    cout << "myFloat        " << u.myFloat << endl;

    float theFloat = *reinterpret_cast<float*>(&i);
    cout << "theFloat       " << theFloat << endl;

    return 0;
}

The output of this code, with both the GCC and clang compilers, is as expected:

size of int    4
size of float  4
myInt          65535
myFloat        9.18341e-41
theFloat       9.18341e-41

My question is, does the standard actually preclude the value of myFloat from being deterministic? Is the use of a reinterpret_cast better in any way to perform this type of conversion?

The standard states the following in §9.5.1:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] The size of a union is sufficient to contain the largest of its non-static data members. Each non-static data member is allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.

The last sentence, guaranteeing that all non-static members have the same address, seems to indicate the use of a union is guaranteed to be identical to the use of a reinterpret_cast, but the earlier statement about active data members seems to preclude this guarantee.

So which construct is more correct?

Edit: Using Intel's icpc compiler, the above code produces even more interesting results:

$ icpc union.cpp
$ ./a.out
size of int    4
size of float  4
myInt          65535
myFloat        0
theFloat       0

asked May 19 '13 16:05

kgraney

3 Answers

It's undefined because there's no guarantee what exactly the value representations of int and float are. The C++ standard doesn't say that a float is stored as an IEEE 754 single-precision floating point number. What exactly should the standard say about you treating an int object with value 0xffff as a float? It doesn't say anything other than that it is undefined.
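
Since the representation is not guaranteed, code that depends on it can at least check its assumptions at compile time. The following is a minimal sketch, not something from the question: the helper name float_is_ieee754_binary32 is hypothetical, and it relies only on std::numeric_limits<float>::is_iec559 and CHAR_BIT from the standard library.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper (name is an assumption for illustration): reports
// whether float claims IEEE 754 (IEC 559) semantics and occupies 32 bits.
constexpr bool float_is_ieee754_binary32() {
    return std::numeric_limits<float>::is_iec559 &&
           sizeof(float) * CHAR_BIT == 32;
}

// Refuse to compile on a platform where the bit-level assumptions
// made by the question's code do not hold.
static_assert(float_is_ieee754_binary32(),
              "this code assumes a 32-bit IEEE 754 float");
static_assert(sizeof(int) == sizeof(float),
              "this code assumes int and float have the same size");
```

These checks do not make the union or reinterpret_cast reads well-defined; they only document and enforce the representation the code expects.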

Practically, however, this is the purpose of reinterpret_cast - to tell the compiler to ignore everything it knows about the types of objects and trust you that this int is actually a float. It's almost always used for machine-specific bit-level jiggery-pokery. The C++ standard just doesn't guarantee you anything once you do it. At that point, it's up to you to understand exactly what your compiler and machine do in this situation.

This is true for both the union and reinterpret_cast approaches. I suggest that reinterpret_cast is "better" for this task, since it makes the intent clearer. However, keeping your code well-defined is always the best approach.
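
One way to keep the code well-defined is to copy the bytes with std::memcpy instead of reading through a union member or a cast pointer. This is a minimal sketch under stated assumptions: the helper names bits_to_float and float_to_bits are hypothetical, float is assumed to be 32 bits wide, and the resulting value still depends on the platform's float representation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the bits of a 32-bit integer as a float
// by copying the object representation. No aliasing rule is violated and
// no inactive union member is read.
float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical helper: the inverse direction, float bits to integer.
std::uint32_t float_to_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Calling bits_to_float(0xffff) gives the same kind of conversion the question is after. Compilers typically turn a fixed-size memcpy like this into a plain register move, so it should not cost more than the cast.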

answered Oct 01 '22 02:10

Joseph Mansfield

It's not undefined behavior. It's implementation-defined behavior. The former means that bad things can happen. The latter means that the implementation has to define what will happen.

The reinterpret_cast violates the strict aliasing rule. So I do not think it will work reliably. The union trick is what people call type-punning and is usually allowed by compilers. The gcc folks document the behavior of the compiler: http://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit_002dfields-implementation.html#Structures-unions-enumerations-and-bit_002dfields-implementation

I think this should work with icpc as well (but they do not appear to document how they implemented it). But when I looked at the assembly, it looked like icc tries to cheat with float and use higher-precision floating point operations. Passing -fp-model source to the compiler fixed that. With that option, I get the same results as with gcc. I do not think you want to use this flag in general; this is just a test to verify my theory.

So for icpc, I think if you switch your code from int/float to long/double, type-punning will work on icpc as well.

answered Oct 01 '22 04:10

Guillaume

Undefined behavior does not mean bad things must happen. It means only that the language definition doesn't tell you what happens. This kind of type pun has been part of C and C++ programming since time immemorial (i.e., since 1969); it would take a particularly perverse implementor to write a compiler where this didn't work.

answered Oct 01 '22 04:10

Pete Becker


Tags: c++, reinterpret-cast, unions