Let's say I have two arithmetic types, an integer one, I, and a floating point one, F. I also assume that std::numeric_limits<I>::max() is smaller than std::numeric_limits<F>::max().
Now, let's say I have a positive integer value i. Because the representable range of F is larger than that of I, F(i) should always be defined behavior.
However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?
Relevant section from the C++14 standard:
4.9 Floating-integral conversions [conv.fpint]
- A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [ Note: If the destination type is bool, see 4.12. — end note ]
- A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating point type. The result is exact if possible. If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value. [ Note: Loss of precision occurs if the integral value cannot be represented exactly as a value of the floating type. — end note ] If the value being converted is outside the range of values that can be represented, the behavior is undefined. If the source type is bool, the value false is converted to zero and the value true is converted to one.
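The two rules quoted above can be illustrated with a minimal sketch. The helper names truncate_to_int and converts_exactly are hypothetical, and the sketch assumes float is IEEE 754 binary32 and double is binary64 (so every 32-bit integer is exact as a double); the standard itself does not guarantee either.

```cpp
#include <cstdint>

// Hypothetical helper: floating-to-integer conversion discards the
// fractional part. The caller must ensure the truncated value fits in int,
// otherwise the conversion is undefined behavior.
inline int truncate_to_int(double d) {
    return static_cast<int>(d);
}

// Hypothetical helper: reports whether an integer survives the conversion
// to float unchanged. Assumes double can hold every std::int32_t exactly,
// so the comparison itself loses nothing.
inline bool converts_exactly(std::int32_t i) {
    const float f = static_cast<float>(i);  // may pick a neighboring value
    return static_cast<double>(f) == static_cast<double>(i);
}
```

Under these assumptions, 16777216 (2^24) converts exactly, while 16777217 does not, because a 24-bit significand cannot hold it.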
Rounding is used when the exact result of a floating-point operation (or a conversion to floating-point format) would need more digits than there are digits in the significand.
Almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow.
The reason it is impossible to represent some decimal numbers this way is that both the exponent and the mantissa must be integers. In other words, every float must be an integer multiplied by an integer power of 2. 9.2 may be simply 92/10, but 10 cannot be expressed as 2^n if n is limited to integer values.
The problem arises because the internal representation of floating-point numbers uses a fixed number of binary digits to represent a decimal number. Some decimal numbers cannot be represented exactly in binary, so in many cases this leads to small roundoff errors.
However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?
No.
Suppose that I is a signed two's complement 32 bit integer type, F is a 32 bit single precision floating point type, and i is the maximum positive integer, 2^31 - 1. This is within the range of the floating point type, but it cannot be represented exactly as a floating point number: some of those 32 bits are used for the sign and exponent, which leaves only 24 bits of significand in the usual IEEE 754 format.
Instead, the result of the conversion from integer to floating point is implementation-defined (either the next lower or the next higher representable value), and it is typically done by rounding to the closest representable value. Here that rounded value is 2^31, which is one beyond the range of the integer type. The conversion back to integer is therefore undefined behavior.