
Efficient mod 3 in x86 assembly

The DIV instruction is expensive on modern processors. Is there a faster way to reduce a 64-bit integer mod 3 in x86 assembly?

Tim McLean asked May 27 '17 02:05


1 Answer

Yes. Division by a constant can be replaced by multiplication with a scaled reciprocal of the divisor, followed by a shift. Several papers describe this technique; the most commonly cited one is:

Torbjörn Granlund and Peter L. Montgomery. "Division by invariant integers using multiplication." ACM SIGPLAN Notices, Vol. 29, No. 6, August 1994, pp. 61-72.

Your C/C++ compiler very likely already uses a variant of this algorithm when optimizations are turned on. For example, my Intel compiler, version 13, turns this:

#include <stdint.h>
uint64_t mod3 (uint64_t a)
{
    return a % 3;
}

into this (line-end annotations mine):

mod3    PROC
; parameter 1: rcx
        mov       r8, 0aaaaaaaaaaaaaaabH      ;; (scaled) reciprocal of 3
        mov       rax, rcx
        mul       r8                          ;; multiply with reciprocal
        shr       rdx, 1                      ;; quotient
        lea       r9, QWORD PTR [rdx+rdx*2]   ;; back multiply with 3
        neg       r9
        add       rcx, r9                     ;; subtract from dividend 
        mov       rax, rcx                    ;; remainder
        ret
mod3    ENDP
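
For readers who prefer C, the same sequence can be sketched by hand. This is only an illustration of what the compiler output above does, not a replacement for letting the compiler do it. It assumes a compiler that provides the unsigned __int128 extension (GCC or Clang); the function name mod3_mul is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: computes a % 3 without a DIV instruction.
   Assumes the unsigned __int128 extension (GCC, Clang). */
uint64_t mod3_mul(uint64_t a)
{
    /* Scaled reciprocal of 3: (2^65 + 1) / 3, same constant as in the listing. */
    const uint64_t recip = 0xAAAAAAAAAAAAAAABULL;

    /* Upper 64 bits of the 128-bit product (what MUL leaves in rdx). */
    uint64_t hi = (uint64_t)(((unsigned __int128)a * recip) >> 64);

    /* Shifting by one more bit divides by 2^65 in total: the quotient a / 3. */
    uint64_t q = hi >> 1;

    /* Back-multiply by 3 and subtract from the dividend to get the remainder. */
    return a - 3 * q;
}
```

The constant works because 3 * 0xAAAAAAAAAAAAAAAB equals 2^65 + 1, so the rounding error in the product stays below 1/3 for every 64-bit input and the shifted result is the exact quotient. Calling mod3_mul(7) should give 1, the same as 7 % 3.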
njuffa answered Nov 11 '22 23:11