This is a theoretical question, I don't really run a fab or anything ;-)
For small N, N-by-N multiplier can be implemented as a tree of 3-to-2 adders of depth log(N) and with N^2 gates - let's ignore Booth encoding etc. This is super-fast, but requires unreasonable amount of hardware.
This gate count will become unreasonable soon enough (as well as wiring). But software multiplication of kN-by-kN via k^2 2N-bit partial products and adding them together is going to be very slow.
My question is - what tradeoffs do we have for very fast hardware-assisted multiplication of moderate N, after N^2 gates becomes too much (for gates as well as wiring), but we still want to be better than pure software.
I can imagine this comes up a lot with custom crypto chips, but I'm just curious here.
Power, as in watts. Each transistor needs a certain amount of energy to switch, and the number of transistors increases as the word depth does. Reducing transistor size helps, but there's no stopping the advance of technology (and consumer desire).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With