In the excellent blog post What Every Programmer Should Know About Undefined Behavior, the section "Violating Type Rules" says:
It is undefined behavior to cast an int* to a float* and dereference it (accessing the "int" as if it were a "float"). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don't want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc).
I'd like to understand the rules in their full nuancedness. Where are they in the C++11 spec? Or failing that, the C spec (C90, C99, C11)?
In the C++11 spec linked from this Stack Overflow question, N3485, I'm looking in 5.2.10 "Reinterpret cast" but don't see language for an exception for char* or unions. So that's probably not the right place. So where is the right place?
The rule you're looking for is in §3.10/10 (in C++11):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
— an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
— a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
— a char or unsigned char type.
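To make two of these bullets concrete, here is a minimal sketch of accesses the list permits: reading an int through unsigned char, and through its unsigned counterpart. The function names are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>

// Counts the nonzero bytes in the object representation of an int.
// Reading through unsigned char is covered by the last bullet of the rule.
std::size_t nonzero_bytes(const int& value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}

// Reads an int through the corresponding unsigned type, which the
// "signed or unsigned type corresponding to the dynamic type" bullet permits.
unsigned as_unsigned(const int& value) {
    return *reinterpret_cast<const unsigned*>(&value);
}
```

Replacing unsigned with float in the second function would leave the list entirely, which is the int* to float* case the question asks about.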
There are different types of (or motivations for) undefined behavior. In the case of casting an int* to float* and then dereferencing it, it is clear that the standard cannot define the result, since what happens will depend on the architecture and on the value of the int. On the other hand, the quoted paragraph is completely wrong: using memcpy to do the conversion is also undefined behavior, for largely the same reasons.
One of the motivations for undefined behavior is to allow implementations to define it in a manner that makes sense for the target architecture, if such a manner exists. This is such a case. A compiler which intentionally causes it to fail is defective. Of course, if we suppose a 32-bit 2's complement int and a 32-bit IEEE float, we may expect certain values of the int to correspond to a trapping NaN, which will cause the program to fail. This is part of the reason the behavior is undefined: to allow such things to happen. But if we are familiar with the low-level details of the hardware, it should work as expected, provided the compiler can see the cast. If it doesn't, this is a QoI (quality of implementation) problem with the compiler, and such a compiler should be avoided for this type of work.
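As a rough illustration of which int values are at issue, the sketch below tests whether a 32-bit pattern would be a NaN if reinterpreted as an IEEE 754 single-precision float (exponent bits all ones, fraction nonzero). The function name is hypothetical, and the layout is an assumption about the target; whether a given NaN actually traps depends on the hardware and the floating-point environment.

```cpp
#include <cstdint>

// Assumes the IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits,
// 23 fraction bits. A NaN has all exponent bits set and a nonzero fraction.
bool is_nan_pattern(std::int32_t value) {
    const std::uint32_t bits = static_cast<std::uint32_t>(value);
    const std::uint32_t exponent_mask = 0x7F800000u;
    const std::uint32_t fraction_mask = 0x007FFFFFu;
    return (bits & exponent_mask) == exponent_mask
        && (bits & fraction_mask) != 0;
}
```

Under these assumptions, an int holding -1 (all bits set) has a NaN pattern, while 0x7F800000 is infinity rather than a NaN.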
As hinted at above, this particular case, and in fact all cases which involve type punning (writing to one member of a union and reading from another, for example), do pose a problem, for which the standard has yet to find adequate wording. The problem occurs because normally the compiler is allowed to assume that pointers to different types (except character types) do not alias: that an int* can never point to the same object as a float*. And proving that two pointers cannot alias is important for optimization. A compiler that breaks code where the pointer cast or the union is clearly visible is just broken, even if the standard says it is undefined behavior. A compiler that breaks code where all it sees are two pointers to unrelated types is understandable, even in cases where the standard says the behavior is well defined.
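The no-alias assumption can be sketched with a function that sees only two pointers to unrelated types. The name store_both is hypothetical; the point is what an optimizer is allowed to conclude, not what any particular compiler does.

```cpp
// Because int and float are unrelated types, the compiler may assume that
// `i` and `f` never point to the same object.
int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;  // assumed not to modify *i
    return *i;  // so this load may be replaced by the constant 1
}
```

Called with two distinct objects, this returns 1 either way. Called with a float* obtained by casting the address of the same int, the optimized and unoptimized versions may disagree, which is the "all it sees are two pointers" situation described above.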
Using memcpy avoids this problem by using two different objects, which don't alias. It still encounters undefined behavior because putting the bit pattern of an int into a float, then accessing the float, doesn't have any defined behavior. (Or vice-versa; I know of at least one machine where copying the bits of a float into an int may result in an illegal int value.)
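For reference, the memcpy approach looks roughly like the sketch below. It assumes a 32-bit IEEE 754 float, which the static_asserts check; the function names are hypothetical. Whether reading the destination is meaningful still depends, as noted above, on the copied bit pattern being a valid value of the destination type on the target machine.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes an IEEE 754 float");

// Copies the bytes of a float into a separate integer object.
// No float is ever accessed through an integer lvalue.
std::uint32_t float_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// The reverse direction: the result is only usable if `bits` is a valid
// float representation on the target (for example, not a trapping NaN).
float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under the stated assumptions, float_bits(1.0f) yields 0x3F800000, and bits_to_float(0x3F800000) yields 1.0f.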