
How does DBL_MAX addition work?

Tags: c++, c, double

Code

#include<stdio.h>
#include<limits.h>
#include<float.h>

int f( double x, double y, double z){
  return  (x+y)+z == x+(y+z);
}

int ff( long long x, long long y, long long z){
  return  (x+y)+z == x+(y+z);
}

int main()
{
    printf("%d\n",f(DBL_MAX,DBL_MAX,-DBL_MAX));     
    printf("%d\n",ff(LLONG_MAX,LLONG_MAX,-LLONG_MAX));
    return 0;
}

Output

0
1

I am unable to understand why the two functions behave differently. What is happening here?

dazzieta asked Feb 05 '26 21:02


1 Answer

In the eyes of both the C and the C++ standard, the integer version definitely, and the floating point version potentially, invokes Undefined Behavior, because the result of the computation x + y is not representable in the type the arithmetic is performed in. So both functions may return anything, or even do anything.

However, many real world platforms offer additional guarantees for floating point operations and implement integers in a certain way that lets us explain the results you get.

Considering f, we note that many popular platforms implement floating point math as described in IEEE 754. Following the rules of that standard, we get for the LHS:

DBL_MAX + DBL_MAX = INF

and

INF - DBL_MAX = INF.

The RHS yields

DBL_MAX - DBL_MAX = 0

and

DBL_MAX + 0 = DBL_MAX

and thus LHS != RHS.

Moving on to ff: Many platforms perform signed integer computation in two's complement. Two's complement addition is associative, so the comparison will yield true as long as the optimizer does not change it to something that contradicts two's complement rules.

The latter is entirely possible (for example see this discussion), so you cannot rely on signed integer overflow doing what I explained above. However, it seems that it "was nice" in this case.


Note that this never applies to unsigned integer arithmetic. In C++, unsigned integers implement arithmetic modulo 2^NumBits where NumBits is the number of bits of the type. In this arithmetic, every integer can be represented by picking a representative of its equivalence class in [0, 2^NumBits - 1]. So this arithmetic can never overflow.

For those doubting that the floating point case is potential UB: N4140 5/4 [expr] says

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

which is the case. The inf and NaN stuff is allowed, but not required, in C++ and C floating point math. It is only required if std::numeric_limits<T>::is_iec559 is true for the floating point type T in question. (Or in C, if the implementation defines __STDC_IEC_559__. Otherwise, the Annex F stuff need not apply.) If either of the IEC indicators guarantees us IEEE semantics, the behavior is well defined to do what I described above.

Baum mit Augen answered Feb 08 '26 10:02


