Am I right that any arithmetic operation on any floating-point numbers is unambiguously defined by the IEEE floating-point standard? If yes, just out of curiosity, what is (+0)+(-0)? And is there a way to check such things in practice, in C++ or another commonly used programming language?
The IEEE-754 standard describes floating-point formats, a way to represent real numbers in hardware. There are at least five internal formats for floating-point numbers that are representable in hardware targeted by the MSVC compiler. The compiler only uses two of them.
IEEE Standard 754 floating point is the most common representation today for real numbers on computers, including Intel-based PCs, Macs, and most Unix platforms. The sign bit is as simple as its name: 0 represents a positive number while 1 represents a negative number.
For your case: when all bits (sign, exponent, mantissa) are zero, the floating-point value also represents zero, as defined by IEEE 754. More specifically, that value is a "positive zero", also written as +0.
However, in computing, some number representations allow for the existence of two zeros, often denoted by −0 (negative zero) and +0 (positive zero), regarded as equal by the numerical comparison operations but with possible different behaviors in particular operations.
The IEEE 754 rules of arithmetic for signed zeros state that +0.0 + -0.0 depends on the rounding mode. In the default rounding mode, it will be +0.0. When rounding towards -∞, it will be -0.0.
You can check this in C++ like so:
#include <iostream>

int main() {
    std::cout << "+0.0 + +0.0 == " << +0.0 + +0.0 << std::endl;
    std::cout << "+0.0 + -0.0 == " << +0.0 + -0.0 << std::endl;
    std::cout << "-0.0 + +0.0 == " << -0.0 + +0.0 << std::endl;
    std::cout << "-0.0 + -0.0 == " << -0.0 + -0.0 << std::endl;
    return 0;
}
Output:
+0.0 + +0.0 == 0
+0.0 + -0.0 == 0
-0.0 + +0.0 == 0
-0.0 + -0.0 == -0
My answer deals with IEEE 754-2008, which was the current version of the standard when this was written.
Section 4.3 deals with the rounding of values when performing arithmetic operations in order to fit the bits into the mantissa.
4.3 Rounding-direction attributes
Rounding takes a number regarded as infinitely precise and, if necessary, modifies it to fit in the destination’s format while signaling the inexact exception, underflow, or overflow when appropriate (see 7). Except where stated otherwise, every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that result according to one of the attributes in this clause.
The rounding-direction attribute affects all computational operations that might be inexact. Inexact numeric floating-point results always have the same sign as the unrounded result.
The rounding-direction attribute affects the signs of exact zero sums (see 6.3), and also affects the thresholds beyond which overflow and underflow are signaled.
Section 6.3 prescribes the value of the sign bit when performing arithmetic with special values (NaN, infinities, +0, -0).
6.3 The sign bit
When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.
(emphasis mine)
In other words, (+0) + (-0) = +0 except when the rounding mode is roundTowardNegative, in which case it is (+0) + (-0) = -0.
According to §7.7.4 of the C# Language Specification (emphasis mine):
- Floating-point addition:
float operator +(float x, float y);
double operator +(double x, double y);
The sum is computed according to the rules of IEEE 754 arithmetic. The following table lists the results of all possible combinations of nonzero finite values, zeros, infinities, and NaN's. In the table, x and y are nonzero finite values, and z is the result of x + y. If x and y have the same magnitude but opposite signs, z is positive zero. If x + y is too large to represent in the destination type, z is an infinity with the same sign as x + y.
+   •  x    +0   -0   +∞   -∞   NaN
•••••••••••••••••••••••••••••••••••
y   •  z    y    y    +∞   -∞   NaN
+0  •  x    +0   +0   +∞   -∞   NaN
-0  •  x    +0   -0   +∞   -∞   NaN
+∞  •  +∞   +∞   +∞   +∞   NaN  NaN
-∞  •  -∞   -∞   -∞   NaN  -∞   NaN
NaN •  NaN  NaN  NaN  NaN  NaN  NaN
(+0) + (-0) in C#:
In other words, based on the specification, the addition of two zeros only results in negative zero if both are negative zero. Therefore, the answer to the original question, "What is (+0)+(-0) by IEEE floating point standard?", is +0.
Rounding modes in C#:
In case anyone is interested in changing the rounding mode in C#, in "Is there an C# equivalent of c++ fesetround() function?", Hans Passant states:
Never tinker with the FPU control word in C#. It is the worst possible global variable you can imagine. With the standard misery that globals cause, your changes cannot last and will arbitrarily disappear. The internal exception handling code in the CLR resets it when it processes an exception.