The DIV instruction is expensive on modern processors. Is there a faster way to reduce a 64-bit integer mod 3 in x86 assembly?
There are algorithms for that, based on replacing the division with a multiplication by the (scaled) reciprocal of the divisor. There are various papers on this; the one most commonly cited is:
Torbjörn Granlund and Peter L. Montgomery. "Division by invariant integers using multiplication." ACM SIGPLAN Notices. Vol. 29, No. 6, August 1994, pp. 61-72 (online)
Your C/C++ compiler very likely already uses a variant of this algorithm when optimizations are turned on. For example, my Intel compiler, version 13, turns this:
#include <stdint.h>
uint64_t mod3 (uint64_t a)
{
    return a % 3;
}
into this (line-end annotations mine):
mod3 PROC
; parameter 1: rcx
        mov     r8, 0aaaaaaaaaaaaaaabH          ;; (scaled) reciprocal of 3
        mov     rax, rcx
        mul     r8                              ;; multiply with reciprocal
        shr     rdx, 1                          ;; quotient
        lea     r9, QWORD PTR [rdx+rdx*2]       ;; back multiply with 3
        neg     r9
        add     rcx, r9                         ;; subtract from dividend
        mov     rax, rcx                        ;; remainder
        ret
mod3 ENDP
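For readers who would rather follow the arithmetic in C, here is a rough sketch of what that instruction sequence computes. It is my own transcription, not compiler output. The names mulhi_u64 and mod3_by_reciprocal are made up for illustration, and the code assumes a GCC/Clang-style compiler that provides the unsigned __int128 extension. The constant 0xAAAAAAAAAAAAAAAB equals (2^65 + 1) / 3, so taking the high 64 bits of the product and shifting right by one more bit yields floor(a / 3) for every 64-bit a.

```c
#include <stdint.h>

/* High 64 bits of the full 128-bit product a * b (what MUL leaves in RDX).
   Assumption: the compiler supports the unsigned __int128 extension. */
static uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Hypothetical helper mirroring the assembly listing step by step. */
uint64_t mod3_by_reciprocal(uint64_t a)
{
    /* Scaled reciprocal of 3: (2^65 + 1) / 3, i.e. 2^65 / 3 rounded up. */
    const uint64_t recip = 0xAAAAAAAAAAAAAAABULL;

    /* (a * recip) >> 65 is the quotient a / 3. */
    uint64_t q = mulhi_u64(a, recip) >> 1;

    /* Back multiply by 3 and subtract from the dividend to get the remainder. */
    return a - 3 * q;
}
```

With optimizations on, a compiler will normally produce code like this from a plain a % 3 by itself, so writing it out by hand is mainly useful for understanding the generated assembly.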