I am optimizing a c++ code.
at one critical step, I want to implement the following function y=f(x)
:
f(0)=1
f(1)=2
f(2)=3
f(3)=0
which one is faster ? using a lookup table or i=(i+1)&3
or i=(i+1)%4
? or any better suggestion?
Almost certainly the lookup table is going to be slowest. In a lot of cases, the compiler will generate the same assembly for (i+1)&3
and (i+1)%4
; however depending on the type/signedness of i, they may not be strictly equivalent and the compiler won't be able to make that optimization. For example for the code
int foo(int i)
{
return (i+1)%4;
}
unsigned bar(unsigned i)
{
return (i+1)%4;
}
on my system, gcc -O2
generates:
0000000000000000 <foo>:
0: 8d 47 01 lea 0x1(%rdi),%eax
3: 89 c2 mov %eax,%edx
5: c1 fa 1f sar $0x1f,%edx
8: c1 ea 1e shr $0x1e,%edx
b: 01 d0 add %edx,%eax
d: 83 e0 03 and $0x3,%eax
10: 29 d0 sub %edx,%eax
12: c3 retq
0000000000000020 <bar>:
20: 8d 47 01 lea 0x1(%rdi),%eax
23: 83 e0 03 and $0x3,%eax
26: c3 retq
so as you can see because of the rules about signed modulus results, (i+1)%4
generates a lot more code in the first place.
Bottom line, you're probably best off using the (i+1)&3
version if that expresses what you want, because there's less chance for the compiler to do something you don't expect.
I won't get into the discussion of premature optimization. But the answer is that they will be the same speed.
Any sane compiler will compile them to the same thing. Division/modulus by a power of two will be optimized to bitwise operations anyway.
So use whichever you find (or others will find) to be more readable.
EDIT : As Roland has pointed out, it does sometimes behave different depending on the signness:
Unsigned &:
int main(void)
{
unsigned x;
cin >> x;
x = (x + 1) & 3;
cout << x;
return 0;
}
mov eax, DWORD PTR _x$[ebp]
inc eax
and eax, 3
push eax
Unsigned Modulus:
int main(void)
{
unsigned x;
cin >> x;
x = (x + 1) % 4;
cout << x;
return 0;
}
mov eax, DWORD PTR _x$[ebp]
inc eax
and eax, 3
push eax
Signed &:
int main(void)
{
int x;
cin >> x;
x = (x + 1) & 3;
cout << x;
return 0;
}
mov eax, DWORD PTR _x$[ebp]
inc eax
and eax, 3
push eax
Signed Modulus:
int main(void)
{
int x;
cin >> x;
x = (x + 1) % 4;
cout << x;
return 0;
}
mov eax, DWORD PTR _x$[ebp]
inc eax
and eax, -2147483645 ; 80000003H
jns SHORT $LN3@main
dec eax
or eax, -4 ; fffffffcH
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With