Float subtraction returns incorrect value

Tags:

c++

I have a calculation in which two floats, each a component of a vector object, are subtracted, and the result appears to be incorrect.

The code I'm attempting to use is:

cout << xresult.x << " " << vec1.x << endl;
float xpart1 = xresult.x - vec1.x;
cout << xpart1 << endl;

Running this code prints:

16 17
-1.00002

As you can see, printing out the values of xresult.x and vec1.x tells you that they are 16 and 17 respectively, yet the subtraction operation seems to introduce an error.

Any ideas why?

asked Mar 10 '11 16:03 by bq54



2 Answers

"As you can see, printing out the values of xresult.x and vec1.x tells you that they are 16 and 17 respectively, yet the subtraction operation seems to introduce an error."

No, it doesn't tell us that at all. It tells us that the input values are approximately 16 and 17. The imprecision might, generally, come from two sources: the nature of floating-point representation, and the precision with which the numbers are printed.

Output streams print floating-point values to a certain level of precision. From a description of the std::setprecision function:

"On the default floating-point notation, the precision field specifies the maximum number of meaningful digits to display in total counting both those before and those after the decimal point."

So, the values of xresult.x and vec1.x are 16 and 17 only to 6 significant digits, which is the default stream precision. In fact, one is slightly less than 16 and the other slightly more than 17. (Note that this has nothing to do with imprecise floating-point representation. The declarations float f = 16 and float g = 17 both assign exact values: a float can hold the integers 16 and 17 exactly, although there are infinitely many other integers it cannot hold.) When we subtract slightly-more-than-17 from slightly-less-than-16, we get an answer that is slightly larger in magnitude than negative 1.

To prove to yourself that this is the case, do one or both of these experiments. First, in your own code, include <iomanip> and add "cout << std::setprecision(10);" before printing those values. Second, run this test program:

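#include <iostream>
#include <iomanip>

int main() {
  // Print the same subtraction at precisions 0 through 9.
  // The default stream precision is 6, so the 7th line (i == 6)
  // shows what a plain cout << would print.
  for(int i = 0; i < 10; i++) {
    std::cout << std::setprecision(i) <<
      15.99999f << " - " << 17.00001f << " = " <<
      15.99999f - 17.00001f << "\n";
  }
}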

Notice how the 7th line of output matches your case:

16 - 17 = -1.00002

P.S. All of the other advice about imprecise floating-point representation is valid; it just doesn't apply to your particular circumstance. You really should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

answered Nov 15 '22 00:11 by Robᵩ


This is floating-point arithmetic at work, and it is why numerical code is so "tricky" and full of pitfalls. That result is expected. What is more, whether you see the effect at all, and how large it is, can depend on the processor you're working with.

I'd like to add that the floating-point types float, double, and long double each have a different precision. That is, some of them can represent a given floating-point value more accurately than others. This follows from how the numbers are held in memory.

A float contains fewer significant digits than, say, a double or long double. Hence, when you perform numerics on floats, you must expect them to suffer from larger rounding errors. When dealing with financial data, developers often use some form of "decimal" type. These are much better suited to currency-type manipulations, with better accuracy in the significant digits. That comes at a price, however.

Take a look at the IEEE 754-2008 specification.

answered Nov 14 '22 23:11 by wheaties