There is a (relatively) well known hack for dividing a 32-bit number by three. Instead of using actual expensive division, the number can be multiplied by the magic number 0x55555556
, and the upper 32 bits of the result are what we're looking for. For example, the following C code:
int32_t div3(int32_t x)
{
return x / 3;
}
compiled with GCC and -O2
, results in this:
08048460 <div3>:
8048460: 8b 4c 24 04 mov ecx,DWORD PTR [esp+0x4]
8048464: ba 56 55 55 55 mov edx,0x55555556
8048469: 89 c8 mov eax,ecx
804846b: c1 f9 1f sar ecx,0x1f
804846e: f7 ea imul edx
8048470: 89 d0 mov eax,edx
8048472: 29 c8 sub eax,ecx
8048474: c3 ret
I'm guessing the sub
instruction is responsible for fixing negative numbers, because what it does is essentially add 1 if the argument is negative, and it's a NOP
otherwise.
But why does this work? I've been trying to manually multiply smaller numbers by a 1-byte version of this mask, but I fail to see a pattern, and I can't really find any explanations anywhere. It seems to be a mystery magic number whose origin isn't clear to anyone, just like 0x5f3759df.
Can someone provide an explanation of the arithmetic behind this?
To divide a number by 3 using repeated subtraction, subtract 3 from it over and over again, till you reach 0. The number of times you subtract is the answer to the division problem.
Basically count the number of non-zero odd positions bits and non-zero even position bits from the right. If their difference is divisible by 3, then the number is divisible by 3. For example: 15 = 1111 which has 2 odd and 2 even non-zero bits.
Approach: Keep subtracting the divisor from the dividend until the dividend becomes less than the divisor. The dividend becomes the remainder, and the number of times subtraction is done becomes the quotient.
It's because 0x55555556
is really 0x100000000 / 3
, rounded up.
The rounding is important. Since 0x100000000
doesn't divide evenly by 3, there will be an error in the full 64-bit result. If that error were negative, the result after truncation of the lower 32 bits would be too low. By rounding up, the error is positive, and it's all in the lower 32 bits so the truncation wipes it out.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With