Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is round-trip through floating point always defined behavior if floating point range is bigger?

Let's say I have two arithmetic types, an integer one, I, and a floating point one, F. I also assume that std::numeric_limits<I>::max() is smaller than std::numeric_limits<F>::max().

Now, let's say I have a positive integer value i. Because the representable range of F is larger than I, F(i) should always be defined behavior.

However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?


Relevant section from the C++14 standard:

4.9 Floating-integral conversions [conv.fpint]

  1. A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [ Note: If the destination type is bool, see 4.12. — end note ]
  2. A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating point type. The result is exact if possible. If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value. [ Note: Loss of precision occurs if the integral value cannot be represented exactly as a value of the floating type. — end note ] If the value being converted is outside the range of values that can be represented, the behavior is undefined. If the source type is bool, the value false is converted to zero and the value true is converted to one.
like image 796
orlp Avatar asked Apr 29 '15 00:04

orlp


People also ask

What is floating-point rounding?

Rounding is used when the exact result of a floating-point operation (or a conversion to floating-point format) would need more digits than there are digits in the significand.

What every scientist should know about floating-point arithmetic?

Almost every language has a floating-point datatype; computers from PC's to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow.

Why is a floating point number more difficult to represent and process than integer?

The reason it is impossible to represent some decimal numbers this way is that both the exponent and the mantissa must be integers. In other words, all floats must be an integer multiplied by an integer power of 2. 9.2 may be simply 92/10 , but 10 cannot be expressed as 2n if n is limited to integer values.

What causes floating-point error?

It's a problem caused when the internal representation of floating-point numbers, which uses a fixed number of binary digits to represent a decimal number. It is difficult to represent some decimal number in binary, so in many cases, it leads to small roundoff errors.


1 Answers

However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?

No.

Suppose that I is a signed two's complement 32 bit integer type, F is a 32 bit single precision floating point type, and i is the maximum positive integer. This is within the range of the floating point type, but it cannot be represented exactly as a floating point number. Some of those 32 bits are used for the exponent.

Instead, the conversion from integer to floating point is implementation dependent, but typically is done by rounding to the closest representable value. That rounded value is one beyond the range of the integer type. The conversion back to integer fails (better said, it's undefined behavior).

like image 130
David Hammen Avatar answered Sep 24 '22 00:09

David Hammen