Ignoring all the other issues (memory transfer, etc.), I'm looking for some measure of the "cost", which I guess I would quantify as the expected number of bit flips, of multiplying two random floating-point (say 32-bit) numbers versus the cost of adding them.
I guess there may be some important issues (like whether the numbers have the same exponent, etc.) worth considering.
Edit: To clarify, I'm interested in the amount of energy required to perform these operations, rather than the time or the amount of hardware, which is why I think "expected number of bit flips" is the quantity of interest. I think this is a well-defined question: there is certainly some expected number of bit flips required by a given algorithm to perform floating-point multiplication, and I'm looking for the minimum over all algorithms.
Edit 2: Thanks all for responding. The most relevant response I got was from njuffa, who referenced Mark Horowitz's estimates (see page 33). A more up-to-date paper by Horowitz gives slightly different numbers:
Float32 Mult: 3.7 pJ
Float32 Add: 0.9 pJ
Int32 Mult: 3.1 pJ
Int32 Add: 0.1 pJ
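For a quick back-of-the-envelope comparison (not part of the cited paper; this just divides the figures quoted above), the implied multiply-to-add energy ratios can be computed like so:

```c
#include <stdio.h>

/* Divides the Horowitz energy figures quoted above to get rough
 * multiply-vs-add ratios. The values are simply the ones cited in this
 * thread; exact numbers depend on the process technology and source. */
int main(void) {
    const double fp32_mul = 3.7, fp32_add = 0.9;   /* pJ */
    const double int32_mul = 3.1, int32_add = 0.1; /* pJ */

    printf("Float32 mult/add energy ratio: ~%.1fx\n", fp32_mul / fp32_add);
    printf("Int32   mult/add energy ratio: ~%.1fx\n", int32_mul / int32_add);
    return 0;
}
```

By these estimates, a 32-bit floating-point multiply costs roughly four times the energy of a floating-point add, while a 32-bit integer multiply costs roughly thirty times an integer add.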
On modern processors, floating point multiplication is generally slightly more expensive than addition (which is one reason why compilers will typically replace 2*x by x+x).
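As a small illustration of that strength reduction (a sketch of mine, not from the original answer): since 2*x and x+x produce bit-identical results under IEEE-754, a compiler may emit an addition for both of these functions.

```c
/* 2*x and x+x are exactly equivalent in IEEE-754 arithmetic, so the
 * compiler is free to emit a single add instruction for either form,
 * even without any fast-math options. */
float twice_mul(float x) { return 2.0f * x; }
float twice_add(float x) { return x + x; }
```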
On x86 and x86_64, floating point operations are almost always done using SSE instructions (ADDSS, MULSS, etc.), for which addition and multiplication are constant time with no "early outs" (which makes pipelining easier).
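If you want to verify which instructions your own compiler emits (a quick check, assuming gcc or clang on x86-64; the file name is arbitrary), compile a minimal function to assembly:

```c
/* Compile with: gcc -O2 -S fpops.c   (or clang -O2 -S fpops.c)
 * On x86-64 the generated assembly typically uses the scalar SSE
 * instructions mulss and addss for these two functions. */
float my_mul(float a, float b) { return a * b; }
float my_add(float a, float b) { return a + b; }
```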
The actual relative cost is more difficult to quantify, and will depend on a lot of things. The canonical reference here is Agner Fog's "Lists of instruction latencies, throughputs and micro-operation breakdowns": http://www.agner.org/optimize/
One rough heuristic I've heard (though I don't have a reference for it) is that multiplication takes roughly 50% longer than addition.
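If you would rather measure than rely on the heuristic, a crude dependent-chain microbenchmark along these lines (my sketch; results vary a lot by CPU, and on some recent cores add and multiply latencies are equal) gives a first-order comparison:

```c
#include <stdio.h>
#include <time.h>

/* Crude dependent-chain latency comparison: every iteration depends on
 * the previous result, so each loop is bound by the operation's latency
 * rather than its throughput. Compile with optimization (e.g. -O2) but
 * without fast-math reassociation; numbers vary by CPU and compiler. */
int main(void) {
    const long N = 200000000L;
    volatile float seed = 1.0f;    /* volatile read defeats constant folding */
    float a = 0.0f, m = 1.0f;

    clock_t t0 = clock();
    for (long i = 0; i < N; ++i)
        a = a + seed;              /* addition chain */
    clock_t t1 = clock();
    for (long i = 0; i < N; ++i)
        m = m * seed;              /* multiplication chain */
    clock_t t2 = clock();

    printf("add chain: %.3f s   mul chain: %.3f s   (keep results: %g %g)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, a, m);
    return 0;
}
```

On CPUs where add and multiply have the same latency the two times will come out close; Agner Fog's tables give the per-microarchitecture details.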