I need to perform a simple multiplication: 400 * 256.3. The result is 102520. Straightforward and simple. But implementing this multiplication in C++ (or C) is a little tricky and confusing to me.
I understand that floating-point numbers are not represented exactly in a computer. I wrote the code below to illustrate the situation. The output is attached too.
So, if I do the multiplication using a float variable, I am subject to rounding error. Using a double variable would have avoided the problem here. But let's say I have very limited resources on an embedded system and have to choose the smallest variable type I can: how can I perform the multiplication using float variables without being susceptible to rounding error?
I know that floating-point math done by a computer is not broken at all. But I am curious about best practices for performing floating-point math. 256.3 is just a value for illustration. I will not know what floating-point value I will get at runtime, but it will certainly be a floating-point value.
#include <stdio.h>

int main()
{
    //perform 400 * 256.3
    //result should be 102520
    float floatResult = 0.00f;
    int intResult = 0;
    double doubleResult = 0.00;

    //float = int * float
    floatResult = 400 * 256.3f;
    printf("400 * 256.3f = (float)->%f\n", floatResult);
    //float = float * float
    floatResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (float)->%f\n", floatResult);
    printf("\n");

    //int = int * float (conversion to int truncates toward zero)
    intResult = 400 * 256.3f;
    printf("400 * 256.3f = (int)->%d\n", intResult);
    //int = float * float
    intResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (int)->%d\n", intResult);
    printf("\n");

    //double = double * double
    doubleResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (double)->%f\n", doubleResult);
    //int = double * double
    intResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (int)->%d\n", intResult);
    printf("\n");

    //double = int * double
    doubleResult = 400 * 256.3;
    printf("400 * 256.3 = (double)->%f\n", doubleResult);
    //int = int * double
    intResult = 400 * 256.3;
    printf("400 * 256.3 = (int)->%d\n", intResult);
    printf("\n");

    //will double give me rounding error?
    if (((400.00 * 256.3) - 102520) != 0) {
        printf("Double give me rounding error!\n");
    }
    //will float give me rounding error?
    if (((400.00f * 256.3f) - 102520) != 0) {
        printf("Float give me rounding error!\n");
    }
    return 0;
}
Output from the code above
If you have a fixed number of decimal digits (1 in the case of 256.3) as well as a bounded range of results, you can use integer multiplication and adjust for the shift in decimal digits through integer division:
int result = (400 * 2563) / 10;
Rounding errors are inherent to floating-point arithmetic, except for a few cases where all operands and the result can be represented exactly. Whether you choose float or double just influences when the error occurs, not if.
First of all, understand that type double has all the same problems as type float. Neither type has infinite precision, so both types are susceptible to precision loss and other problems.
As to what you can do: there are many different problems that come up, depending on what you're doing, and many techniques to overcome them. Many, many words have been written on these techniques; I suggest doing a web search on "avoiding floating point error". But the basic points are:
See also https://www.eskimo.com/~scs/cclass/handouts/sciprog.html .