Background
I was playing with prime numbers in c, when I stumbled upon a new optimization in gcc trunk (will be version 9.x) that optimizes a modulus comparison to 0 into an integer multiply and comparison using magic numbers. In other words x%prime==0
becomes x*Magic_mul<=Magic_cmp
_Bool mod(unsigned x){return x % Constant == 0;}
mod:
imul edi, edi, Magic_mul
cmp edi, Magic_cmp
setbe al
Details
Based on seeing the asm output, it does these optimizations for all integers (well, primes at least) I converted them to hex to help see the patterns, but its not immediately obvious at the moment.
//32bit examples for _Bool mod_n(unsigned x){return x%n==0;};
//note: parameter is unsigned but it becomes a signed multiply
x%3==0; // x*0xAAAAAAAB <= 0x55555555
x%5==0; // x*0xCCCCCCCD <= 0x33333333
x%7==0; // x*0xB6DB6DB7 <= 0x24924924
x%11==0; // x*0xBA2E8BA3 <= 0x1745D174
x%13==0; // x*0xC4EC4EC5 <= 0x13B13B13
x%17==0; // x*0xF0F0F0F1 <= 0x0F0F0F0F
x%19==0; // x*0x286BCA1B <= 0x0D79435E
x%23==0; // x*0xE9BD37A7 <= 0x0B21642C
x%29==0; // x*0x4F72C235 <= 0x08D3DCB0
x%31==0; // x*0xBDEF7BDF <= 0x08421084
x%37==0; // x*0x914C1BAD <= 0x06EB3E45
x%41==0; // x*0xC18F9C19 <= 0x063E7063
x%43==0; // x*0x2FA0BE83 <= 0x05F417D0
x%47==0; // x*0x677D46CF <= 0x0572620A
x%53==0; // x*0x8C13521D <= 0x04D4873E
x%59==0; // x*0xA08AD8F3 <= 0x0456C797
x%61==0; // x*0xC10C9715 <= 0x04325C53
x%67==0; // x*0x07A44C6B <= 0x03D22635
x%71==0; // x*0xE327A977 <= 0x039B0AD1
x%73==0; // x*0xC7E3F1F9 <= 0x0381C0E0
x%79==0; // x*0x613716AF <= 0x033D91D2
x%83==0; // x*0x2B2E43DB <= 0x03159721
x%89==0; // x*0xFA3F47E9 <= 0x02E05C0B
x%97==0; // x*0x5F02A3A1 <= 0x02A3A0FD
///...and even up to 64bit
x%4294967291==0; //x*0x70A3D70A33333333 <= 0x100000005
I checked hacker's delight "INTEGER DIVISION BY CONSTANTS" and it seems like this may be a special case of remainder by multiplication and right shift, but I'm not sure. There is a form on hacker's delight that calculates these same multiplier constants, so that seems promising. I'm guessing the magic comparison constant replaces the shift and compare to zero, but I am having trouble visualizing the 2s complement and whether the shift is arithmetic or logical.
Question
Is there some math behind this or were the numbers determined some other way using the binary representation?
Implications
Since this is simple integer multiply and comparison this could drastically speed up (or reduce memory footprint of) checking for prime using vector extensions/intrinsics. If the math could be extended beyond 64 bits, it could possibly make finding large Big-Number primes much faster?
Take 3, for instance.
0xAB * 3 = 0x201, thus, modulo 0x100, 0xAB is 1 / 3, and, inversely, 0xAB * 3 ≡ 1.
Any 8-bit unsigned integer n can be represented as n = 3*k + r, r < 3, and k is at most 0x55 (decimal 85, integral part of 255 / 3).
So we have options:
r = 0 ⇒ n * 0xAB = 3k * 0xAB = k * (3 * 0xAB) ≡ k * 1 = k ≤ 0x55.
r = 1 ⇒ n * 0xAB = 3k * 0xAB + 0xAB; since 3k * 0xAB is at most 0x55 (mod 0x100), adding it to 0xAB won't overflow, so 3k * 0xAB + 0xAB ≥ 0xAB > 0x55.
r = 2 ⇒ n * 0xAB = 3k * 0xAB + 0x156 ≡ 3k * 0xAB + 0x56 ≥ 0x56 > 0x55 (same as 2.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With