I saw the accepted answer to this post and was surprised that (x & 255) == (x % 256) when x is an unsigned integer. I wondered whether it makes sense to always replace % with & in x % n, for n = 2^a (a = 1, 2, ...) and x a positive integer.
This is a special case in which I, as a human, can make the decision because I know which values the program will deal with, while the compiler does not. Can I gain a significant performance boost if my program uses a lot of modulo operations?
Sure, I could just compile and look at the disassembly. But that would only answer my question for one compiler/architecture. I would like to know if this is in principle faster.
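For concreteness, this is the kind of replacement I have in mind (a minimal sketch; mod_pow2 is just an illustrative name I made up, and it assumes n really is a power of two):

    #include <assert.h>

    /* Illustrative only: for unsigned x and power-of-two n,
       x % n and x & (n - 1) keep exactly the same low bits. */
    static unsigned mod_pow2(unsigned x, unsigned n)
    {
        return x & (n - 1);   /* candidate replacement for x % n */
    }

    int main(void)
    {
        for (unsigned x = 0; x < 1000000u; ++x)
            assert(x % 256 == mod_pow2(x, 256));
        return 0;
    }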
If your integral type is unsigned, the compiler will optimize it, and the result will be the same. If it's signed, something is different...
This program:
    int mod_signed(int i) { return i % 256; }
    int and_signed(int i) { return i & 255; }
    unsigned mod_unsigned(unsigned int i) { return i % 256; }
    unsigned and_unsigned(unsigned int i) { return i & 255; }
will be compiled (by GCC 6.2 with -O3; Clang 3.9 produces very similar code) into:
    mod_signed(int):
            mov     edx, edi
            sar     edx, 31
            shr     edx, 24
            lea     eax, [rdi+rdx]
            movzx   eax, al
            sub     eax, edx
            ret
    and_signed(int):
            movzx   eax, dil
            ret
    mod_unsigned(unsigned int):
            movzx   eax, dil
            ret
    and_unsigned(unsigned int):
            movzx   eax, dil
            ret
The resulting assembly of mod_signed is different because
If both operands to a multiplication, division, or modulus expression have the same sign, the result is positive. Otherwise, the result is negative. The result of a modulus operation's sign is implementation-defined.
and AFAICT, most implementations decided that the sign of the result of a modulus expression is always the same as the sign of the first operand. See this documentation.
Hence, mod_signed is optimized to (from nwellnhof's comment):

    int d = i < 0 ? 255 : 0;
    return ((i + d) & 255) - d;
Logically, we can prove that i % 256 == i & 255 for all unsigned integers, so we can trust the compiler to do its job.
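As a quick sanity check (my own minimal sketch, assuming two's-complement representation and the "sign follows the dividend" behaviour described above), the two expressions really do diverge for negative signed values, which is exactly why the compiler has to emit the extra fix-up:

    #include <stdio.h>

    int main(void)
    {
        int i = -1;
        /* With "sign follows the dividend", -1 % 256 == -1,
           while -1 & 255 == 255, so the forms differ for signed i. */
        printf("%d %% 256 = %d\n", i, i % 256);
        printf("%d & 255 = %d\n", i, i & 255);

        unsigned u = 0xFFFFFFFFu;
        /* For unsigned operands both forms keep the low 8 bits. */
        printf("%u %% 256 = %u\n", u, u % 256);
        printf("%u & 255 = %u\n", u, u & 255);
        return 0;
    }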
I did some measurements with gcc, and if the argument of a / or % is a compile-time constant that's a power of 2, gcc can turn it into the corresponding bit operation.
Here are some of my benchmarks for divisions (see What has a better performance: multiplication or division?), and as you can see, the running times with divisors that are statically known powers of two are noticeably lower than with other statically known divisors.
So if / and % with statically known power-of-two arguments describe your algorithm better than bit operations, feel free to prefer / and %. You shouldn't lose any performance with a decent compiler.
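To illustrate the point (a sketch under the assumption of an unsigned operand, not my benchmark code itself), a division or remainder by a power-of-two constant typically compiles down to a single shift or mask, so the more readable form costs nothing:

    /* At -O3, GCC and Clang typically emit identical code for each pair
       (a shift for the division, a mask for the remainder),
       as long as the operand is unsigned. */
    unsigned div_by_16(unsigned x) { return x / 16; }   /* -> x >> 4 */
    unsigned shr_by_4(unsigned x)  { return x >> 4; }

    unsigned rem_by_16(unsigned x) { return x % 16; }   /* -> x & 15 */
    unsigned and_15(unsigned x)    { return x & 15; }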