On most modern 64-bit processors (such as the Intel Core 2 Duo or the Intel i7 series), does the speed of the x86_64 instruction mulq and its variants depend on the operands? For example, will multiplying 11 * 13 be faster than 11111111 * 13131313? Or does it always take the time of the worst case?
TL;DR: No. Fixed-width integer arithmetic operations (barring division, whose latency can vary with the operands) take a constant number of cycles, regardless of the numerical value of the operands.
mulq multiplies two QWORD (64-bit) operands.
The values are represented as 64-bit binary numbers as follows (written here least-significant bit first):
1011000000000000000000000000000000000000000000000000000000000000 = 13
1000110001111010000100110000000000000000000000000000000000000000 = 13131313
The processor sees both of these as the same "size", as both are 64-bit values.
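To make the "same size" point concrete, here is a minimal sketch in C that produces the bit strings shown above. The function name bits_lsb_first is my own, chosen for illustration; it simply walks all 64 bit positions, so the output is always 64 characters long no matter how small the value is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the 64 bits of v into out, least-significant bit first,
 * matching the bit order used in the listing above.
 * out must have room for 64 characters plus the terminating NUL.
 * (bits_lsb_first is a hypothetical helper name, not a library call.) */
void bits_lsb_first(uint64_t v, char out[65])
{
    assert(out != NULL);
    for (int i = 0; i < 64; i++) {
        out[i] = ((v >> i) & 1u) ? '1' : '0';
    }
    out[64] = '\0';
}
```

Calling it with 13 and with 13131313 fills the buffer with two strings of identical length; only the positions of the set bits differ, which is all the multiplier hardware sees.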
Therefore, the cycle count should always be the same, regardless of the actual numerical value of the operands.
More info:
There are the concepts of Leading Zero Anticipation and Leading Zero Detection[1][2] (LZA/LZD) that can be employed to speed up floating-point operations.
To the best of my knowledge, however, no mainstream processor employs either of these methods for integer arithmetic. This is most likely because integer arithmetic (multiplication in this case) is comparatively simple: the overhead of LZA/LZD may simply not be worth it when the integer multiplier can complete the full multiplication in less time anyway.
I don't have a reference to hand, but I would place money on the latency/throughput being independent of the values of the operands. Otherwise, instructions would be a nightmare to schedule.
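If you would rather measure than bet, a rough micro-benchmark along the following lines could be used. This is only a sketch under assumptions: the names mul_timing and time_mul_chain are made up for illustration, it uses the standard C clock() rather than cycle counters, and a compiler will typically emit imul for a plain 64-bit product rather than mulq. The mask parameter keeps the running value small (or lets it span all 64 bits), and each multiplication depends on the previous one, so the loop is bound by multiply latency.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Result of one timing run: CPU seconds spent, plus a checksum whose use
 * keeps the compiler from discarding the loop.
 * (mul_timing and time_mul_chain are hypothetical names.) */
struct mul_timing {
    double seconds;
    uint64_t checksum;
};

/* Time a dependent chain of 64-bit multiplications.
 * A small mask keeps every multiplication on small operands; an all-ones
 * mask lets the operands span the full 64 bits. The AND and OR run in
 * both cases, so their cost cancels out when comparing two runs. */
struct mul_timing time_mul_chain(uint64_t seed, uint64_t factor,
                                 uint64_t mask, long iterations)
{
    assert(iterations >= 0);
    uint64_t x = seed;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        x = ((x * factor) & mask) | 1u;  /* each step depends on the last */
    }
    clock_t stop = clock();

    struct mul_timing result;
    result.seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    result.checksum = x;
    return result;
}
```

One way to use it is to compare time_mul_chain(11, 13, 0xFF, 100000000L) against time_mul_chain(11111111, 13131313, UINT64_MAX, 100000000L). If the multiplier really is constant-latency, the two times should come out close to each other, within the usual measurement noise.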