If I have variables a, b, and c of type double, let c:=a/b, and give a and b values of 7 and 10, then c's value of 0.7 registers as being LESS THAN 0.70.
On the other hand, if the variables are all type extended, then c's value of 0.7 does not register as being less than 0.70.
This seems strange. What information am I missing?
First, it needs to be noted that float literals in Delphi are of the Extended type. So when you compare a Double to a literal, the Double is probably first widened to Extended, and then compared. (Edit: this is true only in 32-bit applications. In 64-bit applications, Extended is an alias of Double.)
Here, all three ShowMessage calls will be displayed.
    // Requires the Math unit (for SameValue) in the uses clause.
    procedure DoSomething;
    var
      A, B: Double;
    begin
      A := 7/10;
      B := 0.7; // Here, we lower the precision of "0.7" to Double
      // Here, A is widened to Extended... but it has already lost precision.
      // This is (kind of) similar to doing Round(0.7) <> 0.7
      if A <> 0.7 then
        ShowMessage('Weird');
      if A = B then // Here it works correctly: both sides are Double.
        ShowMessage('Ok...');
      // Still... the best way to go...
      if SameValue(A, 0.7, 0.0001) then
        ShowMessage('That will never fail you');
    end;
Here is some literature for you:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
There is no representation for the mathematical number 0.7 in binary floating-point. Your statement computes in c the closest double, which (according to what you say, I didn't check) is a little below 0.7.
In extended precision, the closest floating-point number to 0.7 is a different value that lies much closer to 0.7, so the double looks smaller when it is compared against it. But there still is no exact representation for 0.7. There isn't any at any precision in binary floating-point.
As a rule of thumb, any non-integer number whose last non-zero decimal is not 5 cannot be represented exactly as a binary floating-point number (the converse is not true: 0.05 cannot be represented exactly either).
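To see the effect in isolation, here is a minimal console sketch. The program name is made up for illustration, and the result depends on the target: it assumes a compiler and platform where Extended is really wider than Double (for example 32-bit Delphi). Where Extended is an alias of Double, the two variables hold the same value.

```pascal
program CompareWidths;
{$APPTYPE CONSOLE}

var
  D: Double;
  E: Extended;
begin
  D := 0.7; // the literal is rounded to the nearest Double
  E := 0.7; // the literal is kept at Extended precision (where available)

  // For the comparison, D is widened to Extended, but the bits lost
  // when rounding to Double do not come back.
  if D < E then
    Writeln('Double 0.7 compares as less than Extended 0.7')
  else
    Writeln('No difference (Extended is probably the same as Double here)');
end.
```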
It has to do with the number of digits of precision in the two different floating point types you're using, and the fact that a lot of numbers cannot be represented exactly, regardless of precision. (From the pure math side: irrational numbers outnumber rationals)
Take 2/3, for example. It can't be represented exactly in decimal. With 4 significant digits, it would be represented as 0.6667. With 8 significant digits, it would be 0.66666667. The trailing 7 is the result of rounding up, because the next digit (a 6) would be 5 or more if there was room to keep it.
0.6667 is greater than 0.66666667, so the computer will evaluate 2/3 (4 digits) > 2/3 (8 digits).
The same is true with your .7 vs .70 in double and extended vars.
To avoid this specific issue, try to use the same numeric type throughout your code. When working with floating point numbers in general, there are a lot of little things you have to watch out for. The biggest is to not write your code to compare two floats for equality: even if they should be the same value, there are many factors in calculations that can make them end up a very tiny bit different. Instead of comparing for equality, you need to test that the difference between the two numbers is very small. How small the difference has to be is up to you and to the nature of your calculations. That tolerance is usually referred to as epsilon, a name taken from the theorems and proofs of calculus.
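The "difference is very small" test can be written by hand in a few lines. This is only a sketch: NearlyEqual is a hypothetical helper name, and the tolerance of 1E-9 is an arbitrary choice that you would tune to your own calculations.

```pascal
program EpsilonCompare;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when X and Y differ by no more than Epsilon.
function NearlyEqual(X, Y, Epsilon: Double): Boolean;
begin
  Result := Abs(X - Y) <= Epsilon;
end;

var
  A, B, C: Double;
begin
  A := 7;
  B := 10;
  C := A / B;

  // The tiny rounding error in C is far below the chosen tolerance,
  // so this comparison treats C and 0.7 as the same value.
  if NearlyEqual(C, 0.7, 1E-9) then
    Writeln('close enough')
  else
    Writeln('different');
end.
```

An absolute tolerance like this suits values of roughly known magnitude; for values that vary widely in size, a tolerance scaled to the operands is usually a better fit.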
You're missing This Thing.
See especially the 'Accuracy problems' chapter. See also Pascal's answer.
In order to fix your code without using the Extended type, you must add the Math unit and use the SameValue function from there, which is built especially for this purpose.
Be sure to pass an Epsilon value other than 0 when you use SameValue in your case.
For example:
    // Add Math to the uses clause for SameValue.
    var
      a, b, c: double;
    begin
      a := 7; b := 10;
      c := a/b;
      if SameValue(c, 0.70, 0.001) then
        ShowMessage('Ok')
      else
        ShowMessage('Wrong!');
    end;
HTH