How to compute sign(a² - b * c) * sqrt(abs(a² - b * c)) accurately with floating-point arithmetic?

Question

I was wondering if there is a numerically accurate way to compute

sign(a² - b * c) * sqrt(abs(a² - b * c))

with floating-point arithmetic, because it suffers from the following problems (ordered from most to least problematic):

1. cancellation when a² ≈ b * c, which makes both the sign factor and the sqrt factor unstable
2. loss of accuracy when either a² or b * c dominates, because the square/product amplifies any difference between the two terms
3. overflow/underflow (this only happens for very large or very small numbers, so I don't think it is a particularly big problem)

If possible, I would like to stick with common floating point data types like 32- and 64-bit float and not resort to higher precision data types (like decimal) or arbitrary precision libraries (like Python mpmath).

Background

While working on this scipy issue, I've found a part of the code that looks numerically unstable (Link) and even states so in the comments:

        # Distinguish between
        #    r1norm = ||b - Ax|| and
        #    r2norm = rnorm in current code
        #           = sqrt(r1norm^2 + damp^2*||x - x0||^2).
        #    Estimate r1norm from
        #    r1norm = sqrt(r2norm^2 - damp^2*||x - x0||^2).
        # Although there is cancellation, it might be accurate enough.
        if damp > 0:
            r1sq = rnorm**2 - dampsq * xxnorm
            r1norm = sqrt(abs(r1sq))
            if r1sq < 0:
                r1norm = -r1norm

This code basically aims to compute sign(a² - b * c) * sqrt(abs(a² - b * c)) with

`rnorm -> a`
`dampsq -> b`
`xxnorm -> c`
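To make the cancellation concrete, here is a minimal sketch of the direct evaluation in plain Python (64-bit floats). The function name `naive_signed_sqrt` and the test values are illustrative choices, not part of the SciPy code.

```python
import math

def naive_signed_sqrt(a, b, c):
    """Direct evaluation of sign(a^2 - b*c) * sqrt(|a^2 - b*c|)."""
    r1sq = a * a - b * c            # both products are rounded before subtracting
    r1norm = math.sqrt(abs(r1sq))
    return -r1norm if r1sq < 0 else r1norm

# Well-separated terms behave as expected.
print(naive_signed_sqrt(3.0, 1.0, 5.0))   # 9 - 5 = 4   -> 2.0
print(naive_signed_sqrt(1.0, 1.0, 5.0))   # 1 - 5 = -4  -> -2.0

# Near-cancellation: in exact arithmetic a^2 - b*c = 2**-54,
# so the result should be 2**-27 (about 7.45e-9).
a = 1.0 + 2.0**-27
b = 1.0 + 2.0**-26
c = 1.0
print(naive_signed_sqrt(a, b, c))          # 0.0, because a*a rounds to 1 + 2**-26
```

In the last call the term 2**-54 of a² is lost when `a * a` is rounded, so the difference collapses to zero and all information about the result is gone.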

What I tried so far

This problem looks pretty similar to the accurate computation of hypot(a, b) = sqrt(a² + b²) for which compensated fast algorithms exist, e.g., as proposed in

Borges C.F., Fast Compensated Algorithms for the Reciprocal Square Root, the Reciprocal Hypotenuse, and Givens Rotations, arXiv:2103.08694

The computation exploits fused multiply-add (fma) operations to compensate for floating-point rounding errors.

However, I'm not particularly familiar with this deep theory on floating point numerical mathematics, so I was not able to translate the algorithms to the computation of sign(a² - b * c) * sqrt(abs(a² - b * c)).

Besides, the problem at hand comes with

a subtraction instead of addition
an absolute term
the sign prefactor
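For illustration, here is one possible compensated sketch for 64-bit floats. It is not the algorithm from the Borges paper; it swaps in the classical error-free transformations (Veltkamp/Dekker splitting for the products, Knuth's TwoSum for the subtraction), so that it runs without an fma. All function names are hypothetical, and the sketch assumes that no intermediate value overflows or underflows. On Python 3.13+, the product error could instead be obtained as `math.fma(x, y, -p)`.

```python
import math

_SPLITTER = 134217729.0  # 2**27 + 1, Veltkamp splitting constant for binary64

def _split(x):
    """Split x into hi + lo, each fitting in about half the significand."""
    t = _SPLITTER * x
    hi = t - (t - x)
    return hi, x - hi

def two_product(x, y):
    """Return (p, e) with p = fl(x*y) and p + e equal to x*y (Dekker)."""
    p = x * y
    xh, xl = _split(x)
    yh, yl = _split(y)
    e = ((xh * yh - p) + xh * yl + xl * yh) + xl * yl
    return p, e

def two_sum(x, y):
    """Return (s, e) with s = fl(x+y) and s + e equal to x+y (Knuth)."""
    s = x + y
    yy = s - x
    e = (x - (s - yy)) + (y - yy)
    return s, e

def compensated_signed_sqrt(a, b, c):
    """sign(a^2 - b*c) * sqrt(|a^2 - b*c|) with compensated a^2 - b*c."""
    p1, e1 = two_product(a, a)      # a*a = p1 + e1
    p2, e2 = two_product(b, c)      # b*c = p2 + e2
    s, t = two_sum(p1, -p2)         # p1 - p2 = s + t
    r1sq = s + (t + (e1 - e2))      # add the rounding errors back in
    return math.copysign(math.sqrt(abs(r1sq)), r1sq)

a, b, c = 1.0 + 2.0**-27, 1.0 + 2.0**-26, 1.0
print(compensated_signed_sqrt(a, b, c) == 2.0**-27)   # True
```

The subtraction, the absolute value, and the sign prefactor need no special treatment here: the subtraction is handled by negating the second product, and the sign is taken from the compensated difference before the square root is applied.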

chux - Reinstate Monica · Accepted Answer

r1sq = a² - b * c is numerically problematic when a² ≈ b * c.

Without resorting to wider types, one improvement is possible when b * c >= 0: compute d = sqrt(b * c). This at least avoids the rounding that occurs in a*a.

Then form the product r1sq = (a-d)*(a+d).
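A minimal Python sketch of this factored form might look as follows. The function name is hypothetical, and the code assumes b * c >= 0 as stated above. Note that b * c and the square root are still rounded, so this reduces the error rather than eliminating it.

```python
import math

def factored_signed_sqrt(a, b, c):
    """sign(a^2 - b*c) * sqrt(|a^2 - b*c|) via (a - d) * (a + d), d = sqrt(b*c)."""
    bc = b * c
    if bc < 0:
        raise ValueError("the factored form requires b * c >= 0")
    d = math.sqrt(bc)
    r1sq = (a - d) * (a + d)        # a*a is never formed explicitly
    return math.copysign(math.sqrt(abs(r1sq)), r1sq)

print(factored_signed_sqrt(5.0, 4.0, 4.0))   # d = 4, (1)(9) = 9     -> 3.0
print(factored_signed_sqrt(3.0, 5.0, 5.0))   # d = 5, (-2)(8) = -16  -> -4.0
```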

Code could use additional tricks if range were important too, yet it sounds like OP is concerned about precision.

I've used this with the quadratic equation from time to time, yet find wider math more performant.

Tags: floating-point, floating-accuracy, precision, numeric

Asked by MothNik
