Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Floating point math accuracy, c++ vs fortran [duplicate]

Tags:

c++

fortran

I have tried to implement a recursive function, in both C++ and Fortran, which calculates the value of the n'th Legendre polynomial, at x. In Fortran I have

recursive function legendre(n, x) result(p)
  integer, intent(in) :: n
  real(8), intent(in) :: x
  real(8) :: p

  if(n == 0) then
    p = 1.0
  else if (n == 1) then
    p = x
  else
    p = (2.0*real(n,dp)-1.0)*x*legendre(n-1,x)-(real(n,dp)-1.0)*legendre(n-2,x)
    p = p / real(n,dp)
  end if 
end function legendre

and then in C++ I have

double legendre(int n, double x) {
  double p;
  if(n == 0) return 1.0;
  else if(n == 1) return x;
  else {
    p = (2.0*(double)n - 1.0)*x*legendre(n-1,x)-((double)n - 1.0)*legendre(n-2,x);
    p /= (double)n;
    return p;
  }
}

These two functions seem to be exactly the same to me, both using double precission, but the result from the Fortran function is substantial different from the C++ result. For example, legendre(7,-0.2345) = 0.28876207107499049178814404296875 according to WolframAlpha. The two codes above, when compiled with no optimizations produce

Fortran : 0.28876206852410113

C++ : 0.28876207107499052285

I know that the answers should not be the same due to floating point arithmetic, but the difference in value here for double precision seems somewhat large to me. What is the reason that the Fortran value is so far off from the other two ?

like image 262
Hunter Belanger Avatar asked Feb 25 '20 21:02

Hunter Belanger


People also ask

How accurate are floating-point numbers?

The data type float has 24 bits of precision. This is equivalent to only about 7 decimal places. (The rest of the 32 bits are used for the sign and size of the number.) The number of places of precision for float is the same no matter what the size of the number.

What is the problem with floating-point numbers?

It's a problem caused when the internal representation of floating-point numbers, which uses a fixed number of binary digits to represent a decimal number. It is difficult to represent some decimal number in binary, so in many cases, it leads to small roundoff errors.

What every programmer should know about floating-point numbers?

Floating-point representations have a base (which is always assumed to be even) and a precision p. If = 10 and p = 3, then the number 0.1 is represented as 1.00 × 10-1. If = 2 and p = 24, then the decimal number 0.1 cannot be represented exactly, but is approximately 1.10011001100110011001101 × 2-4.

What is floating-point precision error?

Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error cannot be eliminated, and, at best, can only be managed.


2 Answers

Although the variables in your FORTRAN function are defined as double-precision (8 bytes), the constants you have specified are default (single-precision, 4-byte) values.

According to this discussion, that means the arithmetic is performed to single-precision accuracy:

Even if the variable that you are assigning the result to is defined to be DP, the Fortran standard requires that the arithmetic on the constants be performed using SP. That is because you are using default real constants, since you do not have any kind type parameter at the end of the constants. By rule, default real is SP.

And, further on in the same discussion:

...Starting with Fortran 90, published in June 1991, this practice of "promoting" SP constants to DP is prohibited.

So, in order to force double-precision maths, specify the constants as DP: instead of, for example, 1.0, specify 1.0D0 (and so forth for the others).

like image 193
Adrian Mole Avatar answered Oct 16 '22 12:10

Adrian Mole


Thanks to Adrian's response, I was able to fix the problem. Nothing was wrong with the code in the Fortran function, the issue was the value of x which I was passing to it. Even though in the main program, I had defined x as real(8) and latter assigned it the value with a simple

x = -0.2345

which I thought should be double precision. It should actually be

x = -0.2345_dp

This results in the two functions having the same answer. I believe it is likely due to the reason that Adrian pointed out.

like image 31
Hunter Belanger Avatar answered Oct 16 '22 13:10

Hunter Belanger