What are the steps in the algorithm to do floating point division? Why is the result slower than say, multiplication? Is it done the same way we do division by hand? By repeatedly dividing by the divisor, subtracting the result to obtain a remainder, aligning the number again and continuing till the remainder is less than a particular value? Also, why do we gain on performance if instead of doing <pre class="prettyprint"><code>a = b / c </code></pre> we do <pre class="prettyprint"><code>d = 1 / c a = b * d </code></pre> ? Edit: Basically I was asking because someone asked me to distribute a value among contenders based on the assignment of weights. I did all this in integers and was later asked to convert to float, which caused a slowdown in performance. I was just interested in knowing how would C or C++ do these operations that would cause the slowness.

As described in the Wikipedia article Division algorithm, there are two main aproaches to division in computers: <h3>Slow Division</h3> Uses the following recurrence and finds one digit per iteration: <code> partialRemainder[j+1] = radix * partialRemainder[j] - quotientDigit[n-(j+1)]*denominator </code> <h3>Fast Division</h3> Starts with an estimation and converges on the quotient. How accurate you are depends on the number of iterations. Newton-Raphson division (very briefly): <ol> <li>Calculate estimate of the reciprocal.</li> <li>Compute more accurate estimates of the reciprocal.</li> <li>Compute quotient by multiplying the dividend by the reciprocal.</li> </ol>

Why is float division slow?

Tags:

performance

algorithm

hardware

What are the steps in the algorithm to do floating point division?

Why is the result slower than say, multiplication?

Is it done the same way we do division by hand? By repeatedly dividing by the divisor, subtracting the result to obtain a remainder, aligning the number again and continuing till the remainder is less than a particular value?

Also, why do we gain on performance if instead of doing

a = b / c

we do

d = 1 / c
a = b * d

Edit: Basically I was asking because someone asked me to distribute a value among contenders based on the assignment of weights. I did all this in integers and was later asked to convert to float, which caused a slowdown in performance. I was just interested in knowing how would C or C++ do these operations that would cause the slowness.

1000

asked Feb 03 '09 07:02

umar

3 Answers

FPU division often basically uses Newton-Raphson (or some other algorithm) to get a reciprocal then multiplies by that reciprocal. That's why the reciprocal operation is slightly faster than the general division operation.

This HP paper (which is actually more understandable than most papers I come across talking about Newton-Raphson) has this to say about floating point division:

Floating point division and square root take considerably longer to compute than addition and multiplication. The latter two are computed directly while the former are usually computed with an iterative algorithm. The most common approach is to use a division-free Newton-Raphson iteration to get an approximation to the reciprocal of the denominator (division) or the reciprocal square root, and then multiply by the numerator (division) or input argument (square root).

183

answered Oct 04 '22 06:10

Michael Burr

From a hardware point of view division is a iterative algorithm, and the time it takes is proportional to the number of bits. The fastest division that is currently around uses the radix4 algorithm which generates 4 bit of result per iteration. For a 32 bit divide you need 8 steps at least.

Multiplication can be done in parallel to a certain degree. Without going into detail you can break up a large multiplication into several smaller, independent ones. These multiplications can again be broken down until you're at a bit-level, or you stop earlier and use a small lookup-table in hardware. This makes the multiplication hardware heavy from a silicon real estate point of view but very fast as well. It's the classic size/speed tradeoff.

You need log2 steps to combine the parallel computed results, so a 32 bit multiply need 5 logical steps (if you go down to the minimum). Fortunately these 5 steps are a good deal simpler than the division steps (it's just additions). That means in practice multiplies are even faster.

answered Oct 04 '22 06:10

Nils Pipenbrinck

As described in the Wikipedia article Division algorithm, there are two main aproaches to division in computers:

Slow Division

Uses the following recurrence and finds one digit per iteration: partialRemainder[j+1] = radix * partialRemainder[j] - quotientDigit[n-(j+1)]*denominator

Fast Division

Starts with an estimation and converges on the quotient. How accurate you are depends on the number of iterations.

Newton-Raphson division (very briefly):

Calculate estimate of the reciprocal.
Compute more accurate estimates of the reciprocal.
Compute quotient by multiplying the dividend by the reciprocal.

answered Oct 04 '22 06:10

John Mulder

Related questions
                            
                                Why are all my Visual Studio test results "Not executed"
                            
                                Can I have nested try-catch blocks in C++?
                            
                                What's the difference between Ruby's puts and write methods?
                            
                                Powershell . Declare generic list with class defined using 'Add-Type'
                            
                                Are comment headers and footers in your code really necessary? [closed]
                            
                                DbUnit.NET Alternatives
                            
                                What's with [UITableView reloadData]?
                            
                                Performance Cost Of 'try' in C#
                            
                                Implementing a voting system without requiring registration
                            
                                How to build SCons projects with Eclipse CDT?
                            
                                SQL Server Query for Rank (RowNumber) and Groupings
                            
                                Nil NSDate when trying to get date from UTC string in zulu time

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is float division slow?

Tags:

performance

algorithm

hardware

umar

People also ask

3 Answers

Michael Burr

Nils Pipenbrinck

Slow Division

Fast Division

John Mulder

Recent Activity

Donate For Us