 

Why can't GCC optimize the logical AND / bitwise AND pair in "x && (x & 4242)" to "x & 4242"?

Here are two functions which I claim do exactly the same thing:

bool fast(int x) {
    return x & 4242;
}

bool slow(int x) {
    return x && (x & 4242);
}

Logically they do the same thing, and just to be 100% sure I wrote a test that ran all four billion possible inputs through both of them, and they matched. But the assembly code is a different story:
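The equivalence is easy to see: if x & 4242 is nonzero, then x itself must be nonzero, so the leading x && is redundant. A minimal sketch of such an exhaustive check (an illustrative reconstruction, not the exact harness I used) looks like this:

#include <cassert>
#include <cstdint>

bool fast(int x) { return x & 4242; }
bool slow(int x) { return x && (x & 4242); }

int main() {
    // Visit every 32-bit pattern exactly once and compare both functions.
    uint32_t u = 0;
    do {
        // Reinterpreting the bit pattern as int is implementation-defined
        // before C++20, but does the expected thing on x86_64 Linux/GCC.
        int x = static_cast<int>(u);
        assert(fast(x) == slow(x));
        ++u;
    } while (u != 0);   // wraps back to 0 after 2^32 iterations
    return 0;
}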

fast:
    andl    $4242, %edi
    setne   %al
    ret

slow:
    xorl    %eax, %eax
    testl   %edi, %edi
    je      .L3
    andl    $4242, %edi
    setne   %al
.L3:
    rep     ret

I was surprised that GCC could not make the leap of logic to eliminate the redundant test. I tried g++ 4.4.3 and 4.7.2 with -O2, -O3, and -Os, all of which generated the same code. The platform is Linux x86_64.

Can someone explain why GCC isn't smart enough to generate the same code in both cases? I'd also like to know whether other compilers can do better.

Edit to add test harness:

#include <cstdlib>
#include <vector>
using namespace std;

int main(int argc, char* argv[]) {
    // make vector filled with numbers starting from argv[1]
    int seed = atoi(argv[1]);
    vector<int> v(100000);
    for (int j = 0; j < 100000; ++j)
        v[j] = j + seed;

    // count how many times the function returns true
    int result = 0;
    for (int j = 0; j < 100000; ++j)
        for (int i : v)
            result += slow(i); // or fast(i), try both

    return result;
}

I tested the above with clang 5.1 on Mac OS with -O3. It took 2.9 seconds using fast() and 3.8 seconds using slow(). If I instead use a vector of all zeros, there is no significant difference in performance between the two functions.

Asked Apr 14 '14 by John Zwinck



1 Answer

Exactly why should it be able to optimize the code? You're assuming that any transformation that works will be done. That's not at all how optimizers work. They're not Artificial Intelligences. They simply work by parametrically replacing known patterns. For example, Common Subexpression Elimination (CSE) scans an expression for common subexpressions and moves them forward, provided that does not change side effects.
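To make that concrete, here is a minimal sketch of what CSE does (the function names are illustrative only, not from the question):

// Before CSE: the subexpression (a * b) appears twice.
int before_cse(int a, int b) {
    return (a * b) + (a * b) / 2;
}

// After CSE (conceptually): the compiler computes it once and reuses it,
// which is safe here because a * b has no side effects.
int after_cse(int a, int b) {
    int t = a * b;   // common subexpression hoisted into a temporary
    return t + t / 2;
}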

(BTW, CSE shows that optimizers are already quite aware of what code movement is allowed in the possible presence of side effects. They know that you have to be careful with &&. Whether expr && expr can be CSE-optimized or not depends on the side effects of expr.)
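For instance (my own illustration with a hypothetical side-effecting helper, not code from the question):

int counter = 0;

// next() has a side effect, so every evaluation the source demands must happen.
int next() { return ++counter; }

bool pure_operands(int x) {
    // x has no side effects, so evaluating it twice is indistinguishable
    // from evaluating it once; merging the two uses is at least conceivable.
    return x && (x & 4242);
}

bool impure_operands() {
    // Because && short-circuits, the second call to next() may or may not run.
    // The two calls cannot be folded into one without changing behavior.
    return next() && (next() & 4242);
}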

So, in summary: which pattern do you think applies here?

Answered Sep 16 '22 by MSalters