
Seeding the Newton iteration for cube root efficiently

Tags: algorithm, math

How can I compute the cube root of a number efficiently? I think the Newton-Raphson method can be used, but I don't know how to choose the initial guess programmatically so that the number of iterations is minimized.

asked Sep 18 '11 by SnoopyMe



2 Answers

This is a deceptively complex question. Here is a nice survey of some possible approaches.

answered Dec 02 '22 by Ernest Friedman-Hill


In view of the "link rot" that overtook the Accepted Answer, I'll give a more self-contained answer focusing on the topic of quickly obtaining an initial guess suitable for superlinear iteration.

The "survey" by metamerist (Wayback link) provided some timing comparisons for various starting value/iteration combinations (both Newton and Halley methods are included). Its references are to works by W. Kahan, "Computing a Real Cube Root", and by K. Turkowski, "Computing the Cube Root".

metamerist updates the DEC-VAX era bit-fiddling technique of W. Kahan with this snippet, which "assumes 32-bit integers" and relies on the IEEE 754 format of doubles "to generate initial estimates with 5 bits of precision":

inline double cbrt_5d(double d)
{
   /* bias constant: re-centers the IEEE 754 exponent field after dividing it by 3 */
   const unsigned int B1 = 715094163;
   double t = 0.0;
   /* type-punning via pointer casts; assumes 32-bit unsigned int */
   unsigned int* pt = (unsigned int*) &t;
   unsigned int* px = (unsigned int*) &d;
   /* operate on the high word (sign, exponent, top mantissa bits);
      index [1] assumes a little-endian layout of the double */
   pt[1] = px[1] / 3 + B1;
   return t;
}
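
For reference, here is how such a seed gets polished. A Newton step for f(x) = x^3 - d is x <- (2x + d/x^2)/3, and each step roughly doubles the number of correct bits, so four steps take a 5-bit seed past full double precision (~53 bits). A quick sketch (my addition, not part of metamerist's code):

inline double cbrt_newton(double d)   /* assumes d > 0, like the seed above */
{
   double x = cbrt_5d(d);             /* ~5-bit initial estimate */
   for (int i = 0; i < 4; i++)        /* 5 -> 10 -> 20 -> 40 -> 80 bits */
      x = (2.0 * x + d / (x * x)) / 3.0;
   return x;
}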

The code by K. Turkowski provides slightly more precision ("approximately 6 bits") by a conventional powers-of-two scaling on float fr, followed by a quadratic approximation to its cube root over the interval [0.125,1.0):

/* Compute seed with a quadratic approximation */
fr = (-0.46946116F * fr + 1.072302F) * fr + 0.3812513F;  /* result: 0.5 <= fr < 1 */

and a subsequent restoration of the exponent of two (adjusted to one-third). The exponent/mantissa extraction and restoration make use of math library calls to frexp and ldexp.
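
Putting the pieces together, the surrounding argument reduction looks roughly like this (a sketch reconstructed from the description above, not Turkowski's verbatim code):

#include <math.h>

float cbrt_seed(float x)              /* assumes x > 0 */
{
   int ex, shx;
   float fr = frexpf(x, &ex);         /* x = fr * 2^ex, 0.5 <= fr < 1 */
   shx = ex % 3;
   if (shx > 0)
      shx -= 3;                       /* make (ex - shx) divisible by 3 */
   ex = (ex - shx) / 3;               /* exponent of the cube root */
   fr = ldexpf(fr, shx);              /* scale into 0.125 <= fr < 1 */
   /* quadratic seed, approximately 6 bits of precision */
   fr = (-0.46946116F * fr + 1.072302F) * fr + 0.3812513F;
   return ldexpf(fr, ex);             /* restore the adjusted exponent */
}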

Comparison with other cube root "seed" approximations

To appreciate those cube root approximations we need to compare them with other possible forms. First, the criterion for judging: we consider approximations on the interval [1/8,1], and we use the best (minimizing the maximum) relative error.

That is, if f(x) is a proposed approximation to x^(1/3), we measure its relative error:

        error_rel = max | f(x)/x^(1/3) - 1 |  over [1/8, 1]

The simplest approximation would of course be a single constant on the interval. The endpoint values of x^(1/3) are 1/2 and 1, so the best relative error is achieved by their geometric mean, f_0(x) = sqrt(2)/2. This gives 1.27 bits of relative accuracy, a quick-and-dirty starting point for a Newton iteration.

A better approximation would be the best first-degree polynomial:

 f_1(x) = 0.6042181313*x + 0.4531635984

This gives 4.12 bits of relative accuracy, a big improvement, but short of the 5-6 bits promised by the Kahan and Turkowski methods respectively. Still, it's in the ballpark and uses only one multiplication (and one addition).

Finally, what if we allow ourselves a division instead of a multiplication? It turns out that with one division and two "additions" we can have the best linear-fractional function:

 f_M(x) = 1.4774329094 - 0.8414323527/(x+0.7387320679) 

which gives 7.265 bits of relative accuracy.
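
These figures are easy to check numerically. A quick grid-sampling program (an approximation to the true maximum error, not a rigorous minimax computation):

#include <stdio.h>
#include <math.h>

/* worst relative error of each seed against x^(1/3) on [1/8, 1],
   reported in bits: -log2(max |f(x)/x^(1/3) - 1|) */
int main(void)
{
   const int N = 1000000;
   const double f0 = sqrt(2.0) / 2.0;
   double w0 = 0.0, w1 = 0.0, wM = 0.0;
   for (int i = 0; i <= N; i++) {
      double x = 0.125 + 0.875 * i / N;
      double c = cbrt(x);
      double e0 = fabs(f0 / c - 1.0);
      double e1 = fabs((0.6042181313 * x + 0.4531635984) / c - 1.0);
      double eM = fabs((1.4774329094 - 0.8414323527 / (x + 0.7387320679)) / c - 1.0);
      if (e0 > w0) w0 = e0;
      if (e1 > w1) w1 = e1;
      if (eM > wM) wM = eM;
   }
   printf("constant        : %.2f bits\n", -log2(w0));   /* ~1.27 */
   printf("linear          : %.2f bits\n", -log2(w1));   /* ~4.12 */
   printf("linear-fraction : %.2f bits\n", -log2(wM));   /* ~7.27 */
}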

At a glance this seems like an attractive approach. An old rule of thumb was to treat the cost of an FP division as roughly three FP multiplications (and to largely ignore additions and subtractions), but with current FPU designs that is no longer realistic. While the cost of multiplication relative to addition/subtraction has come down, in most cases to a factor of two or even parity, the cost of division has not fallen and is often 7-10 times that of a multiplication. Therefore we must be miserly with our division operations.

answered Dec 02 '22 by hardmath