I have tried to implement a recursive function, in both C++ and Fortran, which calculates the value of the n'th Legendre polynomial, at x. In Fortran I have
recursive function legendre(n, x) result(p)
integer, intent(in) :: n
real(8), intent(in) :: x
real(8) :: p
if(n == 0) then
p = 1.0
else if (n == 1) then
p = x
else
p = (2.0*real(n,dp)-1.0)*x*legendre(n-1,x)-(real(n,dp)-1.0)*legendre(n-2,x)
p = p / real(n,dp)
end if
end function legendre
and then in C++ I have
double legendre(int n, double x) {
double p;
if(n == 0) return 1.0;
else if(n == 1) return x;
else {
p = (2.0*(double)n - 1.0)*x*legendre(n-1,x)-((double)n - 1.0)*legendre(n-2,x);
p /= (double)n;
return p;
}
}
These two functions seem to be exactly the same to me, both using double precission, but the result from the Fortran function is substantial different from the C++ result. For example, legendre(7,-0.2345) = 0.28876207107499049178814404296875 according to WolframAlpha. The two codes above, when compiled with no optimizations produce
Fortran : 0.28876206852410113
C++ : 0.28876207107499052285
I know that the answers should not be the same due to floating point arithmetic, but the difference in value here for double precision seems somewhat large to me. What is the reason that the Fortran value is so far off from the other two ?
The data type float has 24 bits of precision. This is equivalent to only about 7 decimal places. (The rest of the 32 bits are used for the sign and size of the number.) The number of places of precision for float is the same no matter what the size of the number.
It's a problem caused when the internal representation of floating-point numbers, which uses a fixed number of binary digits to represent a decimal number. It is difficult to represent some decimal number in binary, so in many cases, it leads to small roundoff errors.
Floating-point representations have a base (which is always assumed to be even) and a precision p. If = 10 and p = 3, then the number 0.1 is represented as 1.00 × 10-1. If = 2 and p = 24, then the decimal number 0.1 cannot be represented exactly, but is approximately 1.10011001100110011001101 × 2-4.
Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error cannot be eliminated, and, at best, can only be managed.
Although the variables in your FORTRAN
function are defined as double-precision (8 bytes), the constants you have specified are default (single-precision, 4-byte) values.
According to this discussion, that means the arithmetic is performed to single-precision accuracy:
Even if the variable that you are assigning the result to is defined to be DP, the Fortran standard requires that the arithmetic on the constants be performed using SP. That is because you are using default real constants, since you do not have any kind type parameter at the end of the constants. By rule, default real is SP.
And, further on in the same discussion:
...Starting with Fortran 90, published in June 1991, this practice of "promoting" SP constants to DP is prohibited.
So, in order to force double-precision maths, specify the constants as DP: instead of, for example, 1.0
, specify 1.0D0
(and so forth for the others).
Thanks to Adrian's response, I was able to fix the problem. Nothing was wrong with the code in the Fortran function, the issue was the value of x which I was passing to it. Even though in the main program, I had defined x as real(8) and latter assigned it the value with a simple
x = -0.2345
which I thought should be double precision. It should actually be
x = -0.2345_dp
This results in the two functions having the same answer. I believe it is likely due to the reason that Adrian pointed out.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With