Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is 24.0000 not equal to 24.0000 in MATLAB?

I am writing a program where I need to delete duplicate points stored in a matrix. The problem is that when it comes to check whether those points are in the matrix, MATLAB can't recognize them in the matrix although they exist.

In the following code, intersections function gets the intersection points:

[points(:,1), points(:,2)] = intersections(...
    obj.modifiedVGVertices(1,:), obj.modifiedVGVertices(2,:), ...
    [vertex1(1) vertex2(1)], [vertex1(2) vertex2(2)]);

The result:

>> points
points =
   12.0000   15.0000
   33.0000   24.0000
   33.0000   24.0000

>> vertex1
vertex1 =
    12
    15

>> vertex2    
vertex2 =
    33
    24

Two points (vertex1 and vertex2) should be eliminated from the result. It should be done by the below commands:

points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);

After doing that, we have this unexpected outcome:

>> points
points =
   33.0000   24.0000

The outcome should be an empty matrix. As you can see, the first (or second?) pair of [33.0000 24.0000] has been eliminated, but not the second one.

Then I checked these two expressions:

>> points(1) ~= vertex2(1)
ans =
     0
>> points(2) ~= vertex2(2)
ans =
     1   % <-- It means 24.0000 is not equal to 24.0000?

What is the problem?


More surprisingly, I made a new script that has only these commands:

points = [12.0000   15.0000
          33.0000   24.0000
          33.0000   24.0000];

vertex1 = [12 ;  15];
vertex2 = [33 ;  24];

points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);

The result as expected:

>> points
points =  
   Empty matrix: 0-by-2
like image 596
Kamran Bigdely Avatar asked Mar 26 '09 16:03

Kamran Bigdely


3 Answers

The problem you're having relates to how floating-point numbers are represented on a computer. A more detailed discussion of floating-point representations appears towards the end of my answer (The "Floating-point representation" section). The TL;DR version: because computers have finite amounts of memory, numbers can only be represented with finite precision. Thus, the accuracy of floating-point numbers is limited to a certain number of decimal places (about 16 significant digits for double-precision values, the default used in MATLAB).

Actual vs. displayed precision

Now to address the specific example in the question... while 24.0000 and 24.0000 are displayed in the same manner, it turns out that they actually differ by very small decimal amounts in this case. You don't see it because MATLAB only displays 4 significant digits by default, keeping the overall display neat and tidy. If you want to see the full precision, you should either issue the format long command or view a hexadecimal representation of the number:

>> pi
ans =
    3.1416
>> format long
>> pi
ans =
   3.141592653589793
>> num2hex(pi)
ans =
400921fb54442d18

Initialized values vs. computed values

Since there are only a finite number of values that can be represented for a floating-point number, it's possible for a computation to result in a value that falls between two of these representations. In such a case, the result has to be rounded off to one of them. This introduces a small machine-precision error. This also means that initializing a value directly or by some computation can give slightly different results. For example, the value 0.1 doesn't have an exact floating-point representation (i.e. it gets slightly rounded off), and so you end up with counter-intuitive results like this due to the way round-off errors accumulate:

>> a=sum([0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]);  % Sum 10 0.1s
>> b=1;                                               % Initialize to 1
>> a == b
ans =
  logical
   0                % They are unequal!
>> num2hex(a)       % Let's check their hex representation to confirm
ans =
3fefffffffffffff
>> num2hex(b)
ans =
3ff0000000000000

How to correctly handle floating-point comparisons

Since floating-point values can differ by very small amounts, any comparisons should be done by checking that the values are within some range (i.e. tolerance) of one another, as opposed to exactly equal to each other. For example:

a = 24;
b = 24.000001;
tolerance = 0.001;
if abs(a-b) < tolerance, disp('Equal!'); end

will display "Equal!".

You could then change your code to something like:

points = points((abs(points(:,1)-vertex1(1)) > tolerance) | ...
                (abs(points(:,2)-vertex1(2)) > tolerance),:)

Floating-point representation

A good overview of floating-point numbers (and specifically the IEEE 754 standard for floating-point arithmetic) is What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg.

A binary floating-point number is actually represented by three integers: a sign bit s, a significand (or coefficient/fraction) b, and an exponent e. For double-precision floating-point format, each number is represented by 64 bits laid out in memory as follows:

enter image description here

The real value can then be found with the following formula:

enter image description here

This format allows for number representations in the range 10^-308 to 10^308. For MATLAB you can get these limits from realmin and realmax:

>> realmin
ans =
    2.225073858507201e-308
>> realmax
ans =
    1.797693134862316e+308

Since there are a finite number of bits used to represent a floating-point number, there are only so many finite numbers that can be represented within the above given range. Computations will often result in a value that doesn't exactly match one of these finite representations, so the values must be rounded off. These machine-precision errors make themselves evident in different ways, as discussed in the above examples.

In order to better understand these round-off errors it's useful to look at the relative floating-point accuracy provided by the function eps, which quantifies the distance from a given number to the next largest floating-point representation:

>> eps(1)
ans =
     2.220446049250313e-16
>> eps(1000)
ans =
     1.136868377216160e-13

Notice that the precision is relative to the size of a given number being represented; larger numbers will have larger distances between floating-point representations, and will thus have fewer digits of precision following the decimal point. This can be an important consideration with some calculations. Consider the following example:

>> format long              % Display full precision
>> x = rand(1, 10);         % Get 10 random values between 0 and 1
>> a = mean(x)              % Take the mean
a =
   0.587307428244141
>> b = mean(x+10000)-10000  % Take the mean at a different scale, then shift back
b =
   0.587307428244458

Note that when we shift the values of x from the range [0 1] to the range [10000 10001], compute a mean, then subtract the mean offset for comparison, we get a value that differs for the last 3 significant digits. This illustrates how an offset or scaling of data can change the accuracy of calculations performed on it, which is something that has to be accounted for with certain problems.

like image 149
gnovice Avatar answered Oct 19 '22 14:10

gnovice


Look at this article: The Perils of Floating Point. Though its examples are in FORTRAN it has sense for virtually any modern programming language, including MATLAB. Your problem (and solution for it) is described in "Safe Comparisons" section.

like image 21
Rorick Avatar answered Oct 19 '22 12:10

Rorick


type

format long g

This command will show the FULL value of the number. It's likely to be something like 24.00000021321 != 24.00000123124

like image 27
KitsuneYMG Avatar answered Oct 19 '22 12:10

KitsuneYMG