Math behind gcc9+ modulus optimizations

Question

Background

I was playing with prime numbers in c, when I stumbled upon a new optimization in gcc trunk (will be version 9.x) that optimizes a modulus comparison to 0 into an integer multiply and comparison using magic numbers. In other words x%prime==0 becomes x*Magic_mul<=Magic_cmp

_Bool mod(unsigned x){return x % Constant == 0;}

mod:
  imul edi, edi, Magic_mul
  cmp edi, Magic_cmp
  setbe al

Details

Based on seeing the asm output, it does these optimizations for all integers (well, primes at least) I converted them to hex to help see the patterns, but its not immediately obvious at the moment.

//32bit examples for _Bool mod_n(unsigned x){return x%n==0;};
//note: parameter is unsigned but it becomes a signed multiply
x%3==0;  // x*0xAAAAAAAB <= 0x55555555
x%5==0;  // x*0xCCCCCCCD <= 0x33333333
x%7==0;  // x*0xB6DB6DB7 <= 0x24924924
x%11==0; // x*0xBA2E8BA3 <= 0x1745D174
x%13==0; // x*0xC4EC4EC5 <= 0x13B13B13
x%17==0; // x*0xF0F0F0F1 <= 0x0F0F0F0F
x%19==0; // x*0x286BCA1B <= 0x0D79435E
x%23==0; // x*0xE9BD37A7 <= 0x0B21642C
x%29==0; // x*0x4F72C235 <= 0x08D3DCB0
x%31==0; // x*0xBDEF7BDF <= 0x08421084
x%37==0; // x*0x914C1BAD <= 0x06EB3E45
x%41==0; // x*0xC18F9C19 <= 0x063E7063
x%43==0; // x*0x2FA0BE83 <= 0x05F417D0
x%47==0; // x*0x677D46CF <= 0x0572620A
x%53==0; // x*0x8C13521D <= 0x04D4873E
x%59==0; // x*0xA08AD8F3 <= 0x0456C797
x%61==0; // x*0xC10C9715 <= 0x04325C53
x%67==0; // x*0x07A44C6B <= 0x03D22635
x%71==0; // x*0xE327A977 <= 0x039B0AD1
x%73==0; // x*0xC7E3F1F9 <= 0x0381C0E0
x%79==0; // x*0x613716AF <= 0x033D91D2
x%83==0; // x*0x2B2E43DB <= 0x03159721
x%89==0; // x*0xFA3F47E9 <= 0x02E05C0B
x%97==0; // x*0x5F02A3A1 <= 0x02A3A0FD
///...and even up to 64bit
x%4294967291==0; //x*0x70A3D70A33333333 <= 0x100000005

I checked hacker's delight "INTEGER DIVISION BY CONSTANTS" and it seems like this may be a special case of remainder by multiplication and right shift, but I'm not sure. There is a form on hacker's delight that calculates these same multiplier constants, so that seems promising. I'm guessing the magic comparison constant replaces the shift and compare to zero, but I am having trouble visualizing the 2s complement and whether the shift is arithmetic or logical.

Question

Is there some math behind this or were the numbers determined some other way using the binary representation?

Implications

Since this is simple integer multiply and comparison this could drastically speed up (or reduce memory footprint of) checking for prime using vector extensions/intrinsics. If the math could be extended beyond 64 bits, it could possibly make finding large Big-Number primes much faster?

bipll · Accepted Answer

Take 3, for instance.

0xAB * 3 = 0x201, thus, modulo 0x100, 0xAB is 1 / 3, and, inversely, 0xAB * 3 ≡ 1.

Any 8-bit unsigned integer n can be represented as n = 3*k + r, r < 3, and k is at most 0x55 (decimal 85, integral part of 255 / 3).

So we have options:

r = 0 ⇒ n * 0xAB = 3k * 0xAB = k * (3 * 0xAB) ≡ k * 1 = k ≤ 0x55.
r = 1 ⇒ n * 0xAB = 3k * 0xAB + 0xAB; since 3k * 0xAB is at most 0x55 (mod 0x100), adding it to 0xAB won't overflow, so 3k * 0xAB + 0xAB ≥ 0xAB > 0x55.
r = 2 ⇒ n * 0xAB = 3k * 0xAB + 0x156 ≡ 3k * 0xAB + 0x56 ≥ 0x56 > 0x55 (same as 2.)

Math behind gcc9+ modulus optimizations

Tags:

c

math

compiler-optimization

modulus

technosaurus

1 Answers

bipll

Recent Activity

Donate For Us

Math behind gcc9+ modulus optimizations

Tags:

c

math

compiler-optimization

modulus

technosaurus

1 Answers

bipll

Related questions

Recent Activity

Donate For Us