First of all, sorry for the bad title :/
I'm trying to reproduce a paper's results on calculating the eigenvalues of a tridiagonal symmetric matrix. I'm determining some values 'upper and lower bounds' by rounding to plus and minus infinity, respectively.
Instead of changing the rounding mode every time, I just use the 'trick': fl⁻(y) = -fl⁺(-y), where fl⁻(y) is the value of y when using the minus infinity rounding mode and fl⁺(y) is the value of y when using the rounding mode to plus infinity. So, I have the following piece of code in C:
fesetround(FE_UPWARD);
first = - (-d[i] + x);
second = ( - (( e[i-1]*e[i-1] ) / a_inf ));
a_inf = first + second;
first = d[i] - x;
second = - ( ( e[i-1]*e[i-1] ) / a_sup );
a_sup = first + second;
and it works fine except for one example in which a_inf gives me the right result, but a_sup gives the wrong result, although both first and second variables seem to have the same values.
However, if I do like this:
fesetround(FE_UPWARD);
first = - (-d[i] + x);
second = ( - (( e[i-1]*e[i-1] ) / a_inf ));
fesetround(FE_DOWNWARD);
first = - (-d[i] + x);
second = ( - (( e[i-1]*e[i-1] ) / a_sup ));
I get the right results. So, if I use the trick fl⁻(y) = -fl⁺(-y), I get the right results, if I change the rounding mode and use the original expression I get wrong results. Any idea why?
In both cases, the variables first and second values are the following:
first 1.031250000000000e+07, second -1.031250000000000e+07
first 1.031250000000000e+07, second -1.031250000000000e+07
And the correct values for a_inf and a_sup are -1.862645149230957e-09 and +1.862645149230957e-09, respectively, but in the first case a_sup = 0, which is wrong
What I'm guessing it's happening is some kind of catastrophic cancellation, but I have no idea on how to solve it in this case...
Thanks in advance!
The first problem you're having is that you're only using the 'trick' to get the rounding of first
towards -inf, and not second
, so it is still rounded towards +inf.
The second problem is that C doesn't give you any sort of guarantee about how it does floating point computations. In particular, the compiler is free to reorder and re-associate operations as it sees fit for performance, even if such rearrangements may change the rounding behavior of the program. So when you say:
first = - (-d[i] + x);
the compiler may rearrange that and do a single subtract, rather than negate, add, negate, which reverses the direction of rounding. Now you can sometimes make things work the way you expect by disabling all optimization, but even with that, there's no guarantee.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With