I'm using GCC 4.8.0, compiling in 32-bit mode for a 32-bit target.
I find the behavior of cases 2 and 6 to be confusing:
int16_t s16 = 0;
double dbl = 0.0;
s16 = (int16_t)((double)32767.0); // 1: s16 = 32767
s16 = (int16_t)((double)32768.0); // 2: s16 = 32767
s16 = (int16_t)((double)-32768.0); // 3: s16 = -32768
s16 = (int16_t)((double)-32769.0); // 4: s16 = -32768
dbl = 32767.0;
s16 = (int16_t)dbl; // 5: s16 = 32767
dbl = 32768.0;
s16 = (int16_t)dbl; // 6: s16 = -32768
dbl = -32768.0;
s16 = (int16_t)dbl; // 7: s16 = -32768
dbl = -32769.0;
s16 = (int16_t)dbl; // 8: s16 = -32768
I realize it's implementation defined, but consistency would still be nice. Can anyone explain what's going on?
The behaviour is not implementation-defined; it is undefined, per 6.3.1.4 (1) of the C standard:
If the value of the integral part cannot be represented by the integer type, the behavior is undefined. 61)
61) The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1).
The paragraph was identical in C99, just the footnote had a different number (50).
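If you need a predictable result, one option is to check the range yourself before casting. Below is a minimal sketch of that idea; the helper names `double_fits_int16` and `double_to_int16_sat` are made up for illustration, and the saturating policy (including mapping NaN to 0) is an arbitrary choice of mine, not something the standard prescribes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if (int16_t)d is well defined,
   i.e. the value of d truncated toward zero lies in
   [INT16_MIN, INT16_MAX]. NaN fails both comparisons and is rejected. */
bool double_fits_int16(double d)
{
    /* Truncation toward zero makes the open interval
       (-32769.0, 32768.0) the set of safe inputs. */
    return d > (double)INT16_MIN - 1.0 && d < (double)INT16_MAX + 1.0;
}

/* Hypothetical helper: clamps out-of-range inputs instead of
   invoking undefined behaviour. Mapping NaN to 0 is an assumption. */
int16_t double_to_int16_sat(double d)
{
    if (d != d) {
        return 0;                 /* NaN: arbitrary choice */
    }
    if (!double_fits_int16(d)) {
        return d < 0.0 ? INT16_MIN : INT16_MAX;
    }
    return (int16_t)d;            /* in range, so the cast is defined */
}
```

With this, cases 2 and 6 from the question both go through `double_to_int16_sat(32768.0)` and yield 32767, regardless of whether the argument is a constant or a runtime value.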
For undefined behaviour, it is not uncommon for an expression evaluated at compile time to behave differently from the same expression evaluated at runtime. For example, 1 << width_of_type is often evaluated to 0 if the shift distance is given as a constant expression, and to 1 if it is a runtime value (on x86, for instance, the hardware shift instruction masks the shift count).
The reasoning that leads to the different behaviours for the same code is, as far as I gathered, that since undefined behaviour is a licence for the compiler to produce anything, it may as well do the simplest and/or fastest thing, and the simplest/fastest thing during compilation can be different from the simplest/fastest thing during runtime.
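The same approach applies to the shift example: if you guard the undefined case explicitly, the compile-time and runtime results can no longer diverge. This is only a sketch; `shl32` is a hypothetical name, and returning 0 for an oversized shift distance is my assumption about what the caller wants.

```c
#include <stdint.h>

/* Hypothetical helper: left shift of a 32-bit value that is defined
   for every shift distance. Distances of 32 or more yield 0 here,
   which is a policy choice, not something the standard mandates. */
uint32_t shl32(uint32_t x, unsigned n)
{
    return n < 32u ? x << n : 0u;
}
```

Since the shift is only executed for distances below the width of the type, the result is the same whether the compiler folds the call or emits a shift instruction.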