If I assign the result of a floating-point computation to a variable first, then assign that variable to an unsigned int with implicit type casting, I get one answer. But if I assign the same computation directly to the unsigned int, again with implicit type casting, I get a different answer.
Below is sample code I compiled and ran to demonstrate:
#include <iostream>
int
main( int argc, char** argv )
{
    float payloadInTons = 6550.3;
    // Above, payloadInTons is given a value.
    // Below, two different ways are used to type cast that same value,
    // but the results do not match.
    float tempVal = payloadInTons * 10.0;
    unsigned int right = tempVal;
    std::cout << " right = " << right << std::endl;
    unsigned int rawPayloadN = payloadInTons * 10.0;
    std::cout << " wrong = " << rawPayloadN << std::endl;
    return 0;
}
Does anyone have insight into why "right" is right, and "wrong" is wrong?
By the way, I am using gcc 4.8.2 on Ubuntu 14.04 LTS, if it matters.
You are using double literals. With proper float literals, everything's fine.
#include <iostream>
int
main( int argc, char** argv )
{
    float payloadInTons = 6550.3f;
    float tempVal = payloadInTons * 10.0f;
    unsigned int right = tempVal;
    std::cout << " right = " << right << std::endl;
    unsigned int rawPayloadN = payloadInTons * 10.0f;
    std::cout << "also right = " << rawPayloadN << std::endl;
    return 0;
}
Output :
right = 65503
also right = 65503
Added after the accepted answer
This is not a double vs. float issue. It is a binary floating-point and conversion to int/unsigned issue.
A typical float uses the binary32 representation, which does not give an exact representation of values like 6550.3.
float payloadInTons = 6550.3;
// payloadInTons has the exact value of `6550.2998046875`.
Multiplying by 10.0, below, ensures the calculation is done with at least double precision, with an exact result of 65502.998046875. The product is then converted back to float. The double value is not exactly representable in float and so is rounded to the nearest float, which has an exact value of 65503.0. Then tempVal converts to right as desired, with a value of 65503.
float tempVal = payloadInTons * 10.0;
unsigned int right = tempVal;
Multiplying by 10.0, below, ensures the calculation is done with at least double precision, with an exact result of 65502.998046875, just as before. This time, the value is converted directly to unsigned rawPayloadN, giving the undesired value of 65502. This is because the value is truncated, not rounded.
unsigned int rawPayloadN = payloadInTons * 10.0;
The first "worked" because the conversion went from double to float to unsigned. This involves 2 conversions, which is usually bad. In this case, 2 wrongs made a right.
Solution
Had the code tried float payloadInTons = 6550.29931640625; (the next smaller float number), both results would have been 65502.
The "right" way to convert a floating-point value to some integer type is often to round the result and then perform the type conversion.
// roundf() is declared in <math.h> (<cmath> in C++).
float tempVal = payloadInTons * 10.0;
unsigned int right = roundf(tempVal);
Note: This entire issue is complicated by the value of FLT_EVAL_METHOD. If the user's value is non-zero, floating-point calculations may occur at higher precision than expected.
// Requires <stdio.h> and <float.h>.
printf("FLT_EVAL_METHOD %d\n", (int) FLT_EVAL_METHOD);