What is (+0)+(-0) by IEEE floating point standard?

Tags:

c++

floating-point

ieee-754

Am I right that every arithmetic operation on floating-point numbers is unambiguously defined by the IEEE floating-point standard? If so, just out of curiosity, what is (+0)+(-0)? And is there a way to check such things in practice, in C++ or another commonly used programming language?

307

asked Mar 09 '15 19:03

se0808

2 Answers

Under the IEEE 754 rules of arithmetic for signed zeros, the result of +0.0 + -0.0 depends on the rounding mode. In the default rounding mode (round to nearest, ties to even), it is +0.0. When rounding towards -∞, it is -0.0.

You can check this in C++ like so:

#include <iostream>

int main() {
    std::cout << "+0.0 + +0.0 == " << +0.0 + +0.0 << std::endl;
    std::cout << "+0.0 + -0.0 == " << +0.0 + -0.0 << std::endl;
    std::cout << "-0.0 + +0.0 == " << -0.0 + +0.0 << std::endl;
    std::cout << "-0.0 + -0.0 == " << -0.0 + -0.0 << std::endl;
    return 0;
}

Output:

+0.0 + +0.0 == 0
+0.0 + -0.0 == 0
-0.0 + +0.0 == 0
-0.0 + -0.0 == -0
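
The stream output above shows a positive zero as a bare 0, so the sign is only implied. A minimal sketch of a more explicit check follows, using std::signbit from <cmath>. The helper names describe_zero and runtime_add are made up for illustration, and the volatile copies are only there on the assumption that you want the addition done at run time rather than folded by the compiler.

```cpp
#include <cmath>   // std::signbit
#include <string>

// Hypothetical helper: label a zero with an explicit sign, because
// operator<< prints +0.0 simply as "0". Anything that is not a zero
// (including NaN) is reported as "nonzero".
std::string describe_zero(double v) {
    if (v != 0.0) return "nonzero";
    return std::signbit(v) ? "-0" : "+0";
}

// Hypothetical helper: add through volatile copies so the addition is
// performed at run time instead of being constant-folded.
double runtime_add(double a, double b) {
    volatile double x = a;
    volatile double y = b;
    return x + y;
}
```

Calling describe_zero(runtime_add(+0.0, -0.0)) from main and printing the returned string makes the sign of the result visible without relying on how the stream formats zeros.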

answered Sep 30 '22 13:09

Tavian Barnes

My answer deals with IEEE 754-2008, which was the current version of the standard when this answer was written; it has since been superseded by IEEE 754-2019.

In the IEEE 754:2008 standard:

Section 4.3 deals with how the result of an arithmetic operation is rounded so that it fits into the significand of the destination format.

4.3 Rounding-direction attributes

Rounding takes a number regarded as infinitely precise and, if necessary, modifies it to fit in the destination’s format while signaling the inexact exception, underflow, or overflow when appropriate (see 7). Except where stated otherwise, every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that result according to one of the attributes in this clause.

The rounding-direction attribute affects all computational operations that might be inexact. Inexact numeric floating-point results always have the same sign as the unrounded result.

The rounding-direction attribute affects the signs of exact zero sums (see 6.3), and also affects the thresholds beyond which overflow and underflow are signaled.

Section 6.3 prescribes the value of the sign bit when performing arithmetic with special values (NaN, infinities, +0, -0).

6.3 The sign bit

When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0.

However, x + x = x − (−x) retains the same sign as x even when x is zero.

(emphasis mine)

In other words, (+0) + (-0) = +0 in every rounding mode except roundTowardNegative, in which case (+0) + (-0) = -0.
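
The rounding-mode dependence can be observed from C++ through <cfenv>. The following is a sketch, not a definitive test: sum_sign_bit is a hypothetical helper, it assumes the platform supports the FE_* mode passed in, and it relies on volatile to keep the compiler from evaluating the sum at compile time. Strictly conforming code would also enable the FENV_ACCESS pragma, which not every compiler implements.

```cpp
#include <cfenv>      // std::fegetround, std::fesetround, FE_* macros
#include <cmath>      // std::signbit
#include <stdexcept>  // std::runtime_error

// Hypothetical helper: computes a + b under the given rounding mode (one of
// the FE_* macros) and reports whether the result has its sign bit set.
// The caller's rounding mode is restored before returning.
bool sum_sign_bit(double a, double b, int rounding_mode) {
    const int saved = std::fegetround();
    if (std::fesetround(rounding_mode) != 0) {
        throw std::runtime_error("rounding mode not supported");
    }
    // volatile keeps the addition from being folded at compile time, where
    // the compiler would assume the default rounding mode.
    volatile double x = a;
    volatile double y = b;
    volatile double sum = x + y;
    std::fesetround(saved);
    return std::signbit(sum);
}
```

With this helper, sum_sign_bit(+0.0, -0.0, FE_DOWNWARD) corresponds to the roundTowardNegative case quoted from section 6.3, while FE_TONEAREST, FE_UPWARD and FE_TOWARDZERO correspond to the other rounding-direction attributes. Passing two zeros of the same sign exercises the "x + x retains the same sign as x" rule.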

In the context of C#:

According to §7.7.4 of the C# Language Specification (emphasis mine):

Floating-point addition:

float operator +(float x, float y);

double operator +(double x, double y);

The sum is computed according to the rules of IEEE 754 arithmetic. The following table lists the results of all possible combinations of nonzero finite values, zeros, infinities, and NaN's. In the table, x and y are nonzero finite values, and z is the result of x + y. If x and y have the same magnitude but opposite signs, z is positive zero. If x + y is too large to represent in the destination type, z is an infinity with the same sign as x + y.

 +  •  x      +0     -0     +∞     -∞    NaN
•••••••••••••••••••••••••••••••••••••••••••••
y   •  z      y      y      +∞     -∞    NaN
+0  •  x      +0     +0     +∞     -∞    NaN
-0  •  x      +0     -0     +∞     -∞    NaN
+∞  •  +∞     +∞     +∞     +∞     NaN   NaN
-∞  •  -∞     -∞     -∞     NaN    -∞    NaN
NaN •  NaN    NaN    NaN    NaN    NaN   NaN
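
Because the table restates IEEE 754 behaviour in the default rounding mode, its entries can also be spot-checked from C++ (the language the question asks about) rather than C#. This is only a sketch: label and add_label are hypothetical helpers, and the ASCII labels "+inf" and "-inf" stand in for the table's +∞ and -∞.

```cpp
#include <cmath>   // std::isnan, std::isinf, std::signbit
#include <limits>  // std::numeric_limits
#include <string>

// Hypothetical helper: classify a double using labels close to the table's.
std::string label(double v) {
    if (std::isnan(v)) return "NaN";
    if (std::isinf(v)) return v > 0 ? "+inf" : "-inf";
    if (v == 0.0) return std::signbit(v) ? "-0" : "+0";
    return "finite";
}

// Hypothetical helper: add at run time (volatile prevents constant folding)
// and return the label of the sum.
std::string add_label(double x, double y) {
    volatile double a = x;
    volatile double b = y;
    return label(a + b);
}
```

For example, add_label(inf, -inf) with inf = std::numeric_limits<double>::infinity() checks the +∞ / -∞ cell, and add_label(-0.0, -0.0) checks the only cell of the zero block that yields -0.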

(+0) + (-0) in C#:

In other words, according to the C# specification, adding two zeros yields negative zero only if both operands are negative zero. Therefore, the answer to the original question

What is (+0)+(-0) by IEEE floating point standard?

is +0.

Rounding modes in C#:

For anyone interested in changing the rounding mode in C#: in the question "Is there an C# equivalent of c++ fesetround() function?", Hans Passant states:

Never tinker with the FPU control word in C#. It is the worst possible global variable you can imagine. With the standard misery that globals cause, your changes cannot last and will arbitrarily disappear. The internal exception handling code in the CLR resets it when it processes an exception.

answered Sep 30 '22 15:09

Wai Ha Lee
