Floating point limits code not producing correct results

Question

I am racking my brain trying to figure out why this code does not get the right result. I am looking for the hexadecimal representations of the floating point positive and negative overflow/underflow levels. The code is based off this site and a Wikipedia entry:

7f7f ffff ≈ 3.4028234 × 10³⁸ (max single precision) -- from wikipedia entry, corresponds to positive overflow

Here's the code:

#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cmath>

using namespace std;

int main(void) {

    float two = 2;
    float twentyThree = 23;
    float one27 = 127;
    float one49 = 149;


    float posOverflow, negOverflow, posUnderflow, negUnderflow;

    posOverflow = two - (pow(two, -twentyThree) * pow(two, one27));
    negOverflow = -(two - (pow(two, one27) * pow(two, one27)));


    negUnderflow = -pow(two, -one49);
    posUnderflow = pow(two, -one49);


    cout << "Positive overflow occurs when value greater than: " << hex << *(int*)&posOverflow << endl;


    cout << "Neg overflow occurs when value less than: " << hex << *(int*)&negOverflow << endl;


    cout << "Positive underflow occurs when value greater than: " << hex << *(int*)&posUnderflow << endl;


    cout << "Neg overflow occurs when value greater than: " << hex << *(int*)&negUnderflow << endl;

}

The output is:

Positive overflow occurs when value greater than: f3800000 Neg overflow occurs when value less than: 7f800000 Positive underflow occurs when value greater than: 1 Neg overflow occurs when value greater than: 80000001

To get the hexadecimal representation of the floating point, I am using a method described here:

Why isn't the code working? I know it'll work if positive overflow = 7f7f ffff.

John Calsbeek · Accepted Answer

Your expression for the highest representable positive float is wrong. The page you linked uses (2-pow(2, -23)) * pow(2, 127), and you have 2 - (pow(2, -23) * pow(2, 127)). Similarly for the smallest representable negative float.

Your underflow expressions look correct, however, and so do the hexadecimal outputs for them.

Note that posOverflow and negOverflow are simply +FLT_MAX and -FLT_MAX. But note that your posUnderflow and negUnderflow are actually smaller than FLT_MIN(because they are denormal, and FLT_MIN is the smallest positive normal float).

Floating point limits code not producing correct results

Tags:

c++

floating-point

rounding

Babbu Maan

1 Answers

John Calsbeek

Recent Activity

Donate For Us

Floating point limits code not producing correct results

Tags:

c++

floating-point

rounding

Babbu Maan

1 Answers

John Calsbeek

Related questions

Recent Activity

Donate For Us