Puzzled by different result from "same" type cast, float to int

Question

If I first assign the result of a floating-point computation to a variable and then assign that variable to an unsigned int (an implicit conversion), I get one answer. But if I assign the same computation directly to the unsigned int, again with an implicit conversion, I get a different answer.

Below is sample code I compiled and ran to demonstrate:

#include <iostream>



int
main( int argc, char** argv )
{
    float   payloadInTons = 6550.3;


    //  Above, payloadInTons is given a value.
    //  Below, two different ways are used to type cast that same value,
    //  but the results do not match.
    float tempVal = payloadInTons * 10.0;
    unsigned int right = tempVal;
    std::cout << "    right = " << right << std::endl;


    unsigned int rawPayloadN = payloadInTons * 10.0;
    std::cout << "    wrong = " << rawPayloadN << std::endl;


    return 0;
}

Does anyone have insight into why "right" is right, and "wrong" is wrong?

By the way, I am using gcc 4.8.2 on Ubuntu 14.04 LTS, if it matters.

Quentin · Accepted Answer

You are using double literals. With proper float literals, everything's fine.

#include <iostream>

int
main( int argc, char** argv )
{
    float   payloadInTons = 6550.3f;
    float tempVal = payloadInTons * 10.0f;

    unsigned int right = tempVal;
    std::cout << "     right = " << right << std::endl;

    unsigned int rawPayloadN = payloadInTons * 10.0f;
    std::cout << "also right = " << rawPayloadN << std::endl;


    return 0;
}

Output :

     right = 65503
also right = 65503

chux - Reinstate Monica · Answer

Posted after an answer was accepted.

This is not a double vs. float issue. It is a binary floating-point and conversion to int/unsigned issue.

A typical float uses the binary32 representation, which does not give an exact representation of values like 6550.3.

float payloadInTons = 6550.3;
// payloadInTons has the exact value of `6550.2998046875`.
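
One way to check the stored value yourself is to print the float with enough decimal places. This is only a sketch: `exact_decimal` is a hypothetical helper name, and it assumes an IEEE 754 binary32 `float` and a C library whose `printf` family prints the exact decimal expansion (glibc does).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format a float with a fixed number of decimals.
// A float converts to double without any change in value, so the digits
// printed are those of the value actually stored in the float.
std::string exact_decimal(float value, int decimals)
{
    char buffer[64];
    int written = std::snprintf(buffer, sizeof buffer, "%.*f",
                                decimals, static_cast<double>(value));
    // Guard against a formatting error or a truncated result.
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, written);
}
```

Under those assumptions, `exact_decimal(6550.3f, 10)` should give `6550.2998046875`.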

Multiplying by 10.0, below, ensures the calculation is done with at least double precision, with an exact result of 65502.998046875. The product is then converted back to float. The double value is not exactly representable in float (it lies exactly halfway between the floats 65502.99609375 and 65503.0), so under the default round-to-nearest, ties-to-even mode it is rounded to the float with an exact value of 65503.0. Then tempVal converts to right as desired, with a value of 65503.

float tempVal = payloadInTons * 10.0;
unsigned int right = tempVal;

Multiplying by 10.0, below, ensures the calculation is done with at least double precision, with an exact result of 65502.998046875, just as before. This time, the value is converted directly to unsigned rawPayloadN, with the undesired value of 65502. This is because the value is truncated, not rounded.

unsigned int rawPayloadN = payloadInTons * 10.0;

The first “worked” because the conversion was double to float to unsigned. This involves 2 conversions, which is usually bad. In this case, 2 wrongs made a right.
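
The two conversion paths can be put side by side with the casts written out explicitly. This is a minimal sketch, assuming IEEE 754 arithmetic, the default round-to-nearest mode, and FLT_EVAL_METHOD of 0; the function names `via_float` and `direct` are made up for illustration.

```cpp
#include <cassert>

// Path 1: the double product is first rounded to float, then truncated.
unsigned int via_float(float payload)
{
    assert(payload >= 0.0f);  // negative values cannot be safely converted to unsigned
    float temp = static_cast<float>(payload * 10.0);  // double -> float (rounds to nearest)
    return static_cast<unsigned int>(temp);           // float -> unsigned (truncates)
}

// Path 2: the double product is truncated directly.
unsigned int direct(float payload)
{
    assert(payload >= 0.0f);
    return static_cast<unsigned int>(payload * 10.0);  // double -> unsigned (truncates)
}
```

With `6550.3f` as input, `via_float` should return 65503 and `direct` should return 65502 under the stated assumptions, matching the "right" and "wrong" outputs in the question.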

Solution

Had the code tried float payloadInTons = 6550.29931640625; (the next smaller float), both results would have been 65502.
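
This claim can be checked with `std::nextafter`, which steps to the adjacent representable value. It is a sketch only, assuming IEEE 754 binary32 floats and default rounding; `next_lower`, `tenths_via_float` and `tenths_direct` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Next representable float towards zero (the next smaller one for positive input).
float next_lower(float value)
{
    return std::nextafter(value, 0.0f);
}

// double product -> float (rounds) -> unsigned (truncates)
unsigned int tenths_via_float(float payload)
{
    assert(payload >= 0.0f);
    float temp = static_cast<float>(payload * 10.0);
    return static_cast<unsigned int>(temp);
}

// double product -> unsigned (truncates)
unsigned int tenths_direct(float payload)
{
    assert(payload >= 0.0f);
    return static_cast<unsigned int>(payload * 10.0);
}
```

Starting from `next_lower(6550.3f)`, which is 6550.29931640625, both functions should return 65502: this time the float rounding step no longer happens to land on 65503.0.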

The "right" way to convert a floating point value to some integer type is often to round the result and then perform the type conversion.

float tempVal = payloadInTons * 10.0;
unsigned int right = roundf(tempVal);
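
In C++ the same idea can be written with `std::lround` from `<cmath>`, used here in place of `roundf` because it returns an integer type directly. This is a sketch under assumptions, not a general-purpose routine: `rounded_tenths` is a hypothetical name, and the code assumes a non-negative input whose result fits in `unsigned int`.

```cpp
#include <cassert>
#include <cmath>

// Round to the nearest integer first, then convert.
// Assumption: the input is non-negative and the result fits in unsigned int;
// this sketch does not handle anything else.
unsigned int rounded_tenths(float payload)
{
    assert(payload >= 0.0f);
    // Nearest integer; halfway cases are rounded away from zero.
    long rounded = std::lround(payload * 10.0);
    return static_cast<unsigned int>(rounded);
}
```

Because the rounding happens before the integer conversion, both `6550.3f` and the next smaller float `6550.29931640625f` should give 65503.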

Note: This entire issue is complicated by the value of FLT_EVAL_METHOD. If the user's value is non-zero, floating point calculations may occur at higher precision than expected.

printf("FLT_EVAL_METHOD %d\n", (int) FLT_EVAL_METHOD);

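For readers who want the meaning of the number and not only its value, here is a small sketch that maps FLT_EVAL_METHOD to a description. The helper names are hypothetical, and the wording paraphrases the meanings given in the C standard; other values are implementation-defined.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describe a FLT_EVAL_METHOD value in words.
std::string describe_eval_method(int method)
{
    switch (method) {
    case 0:  return "each operation is evaluated in its own type";
    case 1:  return "float and double operations are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    case -1: return "indeterminable";
    default: return "implementation-defined";
    }
}

// Description for the compiler and target this code is built with.
std::string current_eval_method()
{
    return describe_eval_method(static_cast<int>(FLT_EVAL_METHOD));
}
```

On a typical x86-64 build using SSE arithmetic the value is usually 0, but that depends on the compiler and its flags, so it is worth printing it and not assuming.
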
Tags:

c++

type-conversion

floating-point

gcc

donjuedo

2 Answers

Quentin

chux - Reinstate Monica
