Determining Floating Point Square Root

How do I determine the square root of a floating point number? Is the Newton-Raphson method a good way? I have no hardware square root either. I also have no hardware divide (but I have implemented floating point divide).

If possible, I would prefer to reduce the number of divides as much as possible since they are so expensive.

Also, what should the initial guess be to reduce the total number of iterations?

Thank you so much!

asked Feb 10 '12 by Veridian


2 Answers

When you use Newton-Raphson to compute a square-root, you actually want to use the iteration to find the reciprocal square root (after which you can simply multiply by the input--with some care for rounding--to produce the square root).

More precisely: we use the function f(x) = x^-2 - n. Clearly, if f(x) = 0, then x = 1/sqrt(n). This gives rise to the Newton iteration:

x_(i+1) = x_i - f(x_i)/f'(x_i)
        = x_i - (x_i^-2 - n)/(-2x_i^-3)
        = x_i + (x_i - nx_i^3)/2
        = x_i*(3/2 - 1/2 nx_i^2)

Note that (unlike the iteration for the square root), this iteration for the reciprocal square root involves no divisions, so it is generally much more efficient.
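
For concreteness, here is a minimal C sketch of one such step (hardware float arithmetic stands in for the asker's soft-float primitives; rsqrt_step is my name, not from the answer):

    #include <stdio.h>

    /* One Newton step toward y ~ 1/sqrt(n): y' = y*(3/2 - n*y*y/2).
       Note: only multiplies and subtracts -- no division anywhere. */
    static float rsqrt_step(float n, float y) {
        return y * (1.5f - 0.5f * n * y * y);
    }

    int main(void) {
        float n = 2.390625f;  /* the reduced input from the example below */
        float y = 0.5f;       /* crude initial guess */
        for (int i = 0; i < 5; ++i) {
            y = rsqrt_step(n, y);
            printf("x_%d = %.10f\n", i + 1, y);
        }
        return 0;
    }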

I mentioned in your question on divide that you should look at existing soft-float libraries, rather than re-inventing the wheel. That advice applies here as well. This function has already been implemented in existing soft-float libraries.


Edit: the questioner seems to still be confused, so let's work an example: sqrt(612). 612 is 1.1953125 x 2^9 (or b1.0011001 x 2^9, if you prefer binary). Pull the even part out of the exponent (9 = 8 + 1) to write the input as f * 2^(2m), where m is an integer and f is in the range [1,4). Then we will have:

sqrt(n) = sqrt(f * 2^(2m)) = sqrt(f) * 2^m

Applying this reduction to our example gives f = 1.1953125 * 2 = 2.390625 (b10.011001) and m = 4. Now do a Newton-Raphson iteration to find x = 1/sqrt(f), using a starting guess of 0.5 (as I noted in a comment, this guess converges for all f, but you can do significantly better using a linear approximation as an initial guess):

x_0 = 0.5
x_1 = x_0*(3/2 - 1/2 * 2.390625 * x_0^2)
    = 0.6005859...
x_2 = x_1*(3/2 - 1/2 * 2.390625 * x_1^2)
    = 0.6419342...
x_3 = 0.6467077...
x_4 = 0.6467616...

So even with a (relatively bad) initial guess, we get rapid convergence to the true value of 1/sqrt(f) = 0.6467616600226026.

Now we simply assemble the final result:

sqrt(f) = x_n * f = 1.5461646...
sqrt(n) = sqrt(f) * 2^m = 24.738633...

And check: sqrt(612) = 24.738633...
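
Putting the whole recipe together, here is a hedged C sketch of the pipeline (my own naming; it borrows frexpf/ldexpf for the exponent manipulation that a real soft-float library would do with bit operations, and it only handles positive, finite, normal inputs):

    #include <math.h>
    #include <stdio.h>

    static float my_sqrtf(float n) {
        int e;
        float f = frexpf(n, &e);     /* n = f * 2^e, f in [0.5, 1) */
        if (e & 1) {                 /* force the exponent even ... */
            f *= 2.0f;
            e -= 1;
        }                            /* ... so n = f * 2^(2m), f in [0.5, 2) */
        int m = e / 2;

        float y = 0.5f;              /* crude initial guess for 1/sqrt(f) */
        for (int i = 0; i < 6; ++i)  /* quadratic convergence */
            y = y * (1.5f - 0.5f * f * y * y);

        return ldexpf(y * f, m);     /* sqrt(f) = f * (1/sqrt(f)); scale by 2^m */
    }

    int main(void) {
        printf("%f\n", my_sqrtf(612.0f));  /* prints ~24.738633 */
        return 0;
    }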

Obviously, if you want correct rounding, careful analysis is needed to ensure that you carry sufficient precision at each stage of the computation. This requires some bookkeeping, but it isn't rocket science: you keep error bounds and propagate them through the algorithm.

If you want correct rounding without explicitly checking a residual, you need to compute sqrt(f) to a precision of 2p + 2 bits (where p is the precision of the source and destination type). However, you can also take the strategy of computing sqrt(f) to a little more than p bits, squaring that value, and adjusting the trailing bit by one if necessary (which is often cheaper).
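
As an illustration of the square-and-adjust idea (not the answer's actual implementation): for single precision you can cheat by computing the residual in double, where the square of a float is exact, and nudging the candidate to whichever neighbor squares closest to f:

    #include <math.h>

    /* Hedged sketch of "square and adjust the trailing bit": given a
       candidate y within 1 ulp of sqrt(f), return whichever of
       {y - ulp, y, y + ulp} has its square closest to f. y*y is exact
       in double (24 x 24 <= 53 bits), so the comparison is reliable.
       A pure soft-float port would do this residual check in extended
       fixed point instead of leaning on double. */
    static float adjust_sqrt(float y, float f) {
        float lo = nextafterf(y, 0.0f);
        float hi = nextafterf(y, INFINITY);
        float best = y;
        double err = fabs((double)y * y - f);
        if (fabs((double)lo * lo - f) < err) { best = lo; err = fabs((double)lo * lo - f); }
        if (fabs((double)hi * hi - f) < err) { best = hi; }
        return best;
    }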

sqrt is nice in that it is a unary function, which makes exhaustive testing for single-precision feasible on commodity hardware.
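
Exhaustive testing is as simple as it sounds. A sketch, assuming a my_sqrtf under test (e.g. the one above, made non-static) and using the host's sqrtf as the reference oracle:

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    float my_sqrtf(float);  /* the function under test */

    int main(void) {
        uint64_t mismatches = 0;
        for (uint64_t bits = 0; bits <= UINT32_MAX; ++bits) {
            uint32_t u = (uint32_t)bits;
            float x;
            memcpy(&x, &u, sizeof x);
            if (!(x > 0.0f) || isinf(x))   /* positive finite inputs only */
                continue;
            float got = my_sqrtf(x);
            float want = sqrtf(x);
            if (memcmp(&got, &want, sizeof got) != 0)
                ++mismatches;
        }
        printf("%llu mismatches\n", (unsigned long long)mismatches);
        return 0;
    }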

You can find the OS X soft-float sqrtf function on opensource.apple.com, which uses the algorithm described above (I wrote it, as it happens). It is licensed under the APSL, which may or may not be suitable for your needs.

answered Sep 23 '22 by Stephen Canon

Probably (still) the fastest implementation for finding the inverse square root is the famous "fast inverse square root" from Quake III Arena, and it's the ten lines of code that I adore the most.

It's based on Newton's method, but with a few quirks. There's even a great story around it.
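
For reference, a lightly modernized sketch of that code (memcpy replaces the original's pointer cast, which is undefined behavior in modern C):

    #include <stdint.h>
    #include <string.h>

    /* Quake III's fast inverse square root: a bit-trick initial guess
       (the 0x5f3759df "magic constant") followed by one Newton step. */
    static float Q_rsqrt(float number) {
        float x2 = number * 0.5f, y = number;
        uint32_t i;
        memcpy(&i, &y, sizeof i);      /* reinterpret float bits as integer */
        i = 0x5f3759df - (i >> 1);     /* magic initial guess */
        memcpy(&y, &i, sizeof y);
        y = y * (1.5f - x2 * y * y);   /* one Newton-Raphson iteration */
        return y;
    }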

answered Sep 19 '22 by emboss