Suppose that correctly rounded standard library functions such as found in CRlibm are available. Then how would one compute the correctly rounded cubic root of a double-precision input?
This question is not an “actual problem that [I] face”, to quote the FAQ. It is a little bit like homework this way. But the cubic root is a frequently found operation and one could imagine this question being an actual problem that someone faces.
Since “best Stack Overflow questions have a bit of source code in them”, here is a bit of source code:
y = pow(x, 1. / 3.);
The above does not compute a correctly rounded cubic root because 1/3 is not representable exactly as a double
.
ADDITIONAL NOTES:
An article describes how to compute a floating-point cubic root, but the last iteration(s) of the recommended Newton-Raphson algorithm would have to be done to higher precision for the algorithm to compute a correctly rounded double-precision cubic root. That may be the best way to compute it, but I am still looking for a shortcut that would take advantage of existing correctly rounded standardized functions.
C99 includes a cbrt()
function, but it cannot be expected to be correctly rounded or even faithful for all compilers. CRlibm's designers could have chosen to include cbrt()
in the list of provided functions, but they didn't. References to implementations available in other libraries of correctly rounded math functions are welcome.
Given that there are lots of easily computable rational points on the curve x = y^3, I’m tempted to reduce about s^3 ~ x, with s rational and just a few bits wide. Then you have:
cbrt(x) = s * cbrt(1 + (x - s^3)/s)
Obvious thing then is to evaluate the correction term using your favorite series approximation, and compute a residual via head-tail FMA arithmetic to bump the result up or down by an ulp if needed (you won’t need the full computation most of the time, obviously).
This isn’t totally in the spirit of the question, but can definitely be made to work, and it’s very easy to prove the necessary bounds this way. Hopefully someone else can suggest something more clever (I used up my cleverness for the month already).
I am afraid I do not know how to guarantee a correctly rounded double-precision cube root, but can offer one that is very nearly correctly rounded as also mentioned in the question. In other words, the maximum error appears to be very close to 0.5 ulp.
Peter Markstein, "IA-64 and Elementary Functions: Speed and Precision" (Prentice-Hall 2000)
presents efficient FMA-based techniques for correctly rounding reciprocal, square root, and reciprocal square root, but it does not cover the cube root in this regard. In general Markstein's approach requires a preliminary result that is accurate to within 1 ulp prior to the final rounding sequence. I do not have the mathematical wherewithal to extend his technique to the rounding of cube roots, but it seems to me that this should be possible in principle, being a challenge somewhat similar to the reciprocal square root.
Bit-wise algorithms lend themselves easily to the computation of roots with correct rounding. Since tie cases for the IEEE-754 rounding mode to-nearest-or-even cannot occur, one simply needs to carry the computation until it has produced all mantissa bits plus one round bit. The bit-wise algorithm for square root, based on the binomial theorem, is well known in both non-restoring and restoring variants and has been the basis of hardware implementations. The same approach via binomial theorem works for the cube root, and there is a little-known paper that lays out the details of a non-restoring implementation:
H. Peng, “Algorithms for Extracting Square Roots and Cube Roots,” Proceedings 5th IEEE International Symposium on Computer Arithmetic, pp. 121-126, 1981.
Best I can tell from experimenting with it this works well enough for the extraction of cube roots from integers. Since each iteration produces only a single result bit it is not exactly fast. For applications in floating-point arithmetic it has the drawback of using a couple of bookkeeping variables that require roughly twice the number of bits of the final result. This means one would need to use 128-bit integer arithmetic to implement a double-precision cube root.
My C99 code below is based on Halley's rational method for the cube root which has cubic convergence meaning that the initial approximation does not have to be very accurate as the number of valid digits triples in every iteration. The computation can be arranged in various ways. Generally it is numerically advantageous to arrange iterative schemes as
new_guess := old_guess + correction
since for a sufficiently close initial guess, correction
is significantly smaller than old_guess
. This leads to the following iteration scheme for the cube root:
x := x - x * (x3 - a) / (2*x3 + a)
This particular arrangement is also listed in Kahan's notes on cube root. It has the further advantage of lending itself naturally to the use of FMA (fused-multiply add). One drawback is that the computation of 2*x3 could lead to overflow, therefore an argument reduction scheme is required for at least part of the double-precision input domain. In my code I simply apply the argument reduction to all non-exceptional inputs, based on straightforward manipulation of the exponents of IEEE-754 double-precision operands.
The interval [0.125,1) serves as the primary approximation interval. A polynomial minimax approximation is used that returns an initial guess in [0.5,1]. The narrow range facilitates the use of single-precision arithmetic for the low-accuracy portions of the computation.
I cannot prove anything about the error bounds of my implementation, however, testing with 200 million random test vectors against a reference implementation (accurate to about 200 bits) found a total of 277 incorrectly rounded results (so an error rate of roughly 1.4 ppm) with a maximum error of 0.500012 ulps.
double my_cbrt (double a)
{
double b, u, v, r;
float bb, uu, vv;
int e, f, s;
if ((a == 0.0) || isinf(a) || isnan(a)) {
/* handle special cases */
r = a + a;
} else {
/* strip off sign-bit */
b = fabs (a);
/* compute exponent adjustments */
b = frexp (b, &e);
s = e - 3*342;
f = s / 3;
s = s - 3 * f;
f = f + 342;
/* map argument into the primary approximation interval [0.125,1) */
b = ldexp (b, s);
bb = (float)b;
/* approximate cube root in [0.125,1) with relative error 5.22e-3 */
uu = 0x1.2f32c0p-1f;
uu = fmaf (uu, bb, -0x1.62cc2ap+0f);
uu = fmaf (uu, bb, 0x1.7546e0p+0f);
uu = fmaf (uu, bb, 0x1.5d0590p-2f);
/* refine cube root using two Halley iterations w/ cubic convergence */
vv = uu * uu;
uu = fmaf (fmaf (vv, uu, -bb) / fmaf (vv, 2.0f*uu, bb), -uu, uu);
u = (double)uu;
v = u * u; // this product is exact
r = fma (fma (v, u, -b) / fma (v, 2.0*u, b), -u, u);
/* map back from primary approximation interval by jamming exponent */
r = ldexp (r, f);
/* restore sign bit */
r = copysign (r, a);
}
return r;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With