Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does the compiler not optimize this initialization?

Consider the following C code:

extern void foo(int* ip);

void myfunc(void)
{
    int arr[15] = {0};
    for (int i=0; i<10; i++)
    {
        arr[i] = 42;
    }

    foo(arr);
}

I tried with gcc and clang, with -O3 and -Os. In all cases the compiled assembly writes all 15 zeroes before overwriting 10 of them with 42.

I suppose it could just be that no optimization has been written for this case yet, but it seems like a fairly obvious and common case to me. Is there something that prevents the optimization?

I am on x86-32 Linux and used these commands:

gcc -std=c99 -S -O3 hello.c
clang -std=c99 -S -O3 hello.c
like image 560
Tor Klingberg Avatar asked Mar 11 '15 16:03

Tor Klingberg


People also ask

Does the Java compiler optimize?

The JVMs JIT compiler is one of the fascinating mechanisms on the Java platform. It optimizes your code for performance, without giving away its readability. Not only that, beyond the “static” optimization methods of inlining, it also makes decisions based on the way that the code performs in practice.

Do compilers Optimise code?

Compilers are free to optimize code so long as they can guarantee the semantics of the code are not changed.

Why is code optimization done by compilers but not by assemblers?

In any case optimization is done by compiler, which is language-dependant, not by the assembler. No because one instruction in assembly corresponds to one instruction in machine code.


1 Answers

It is not a very scientific explanation, but just an intuition (however, I do happen to know some of the internals of GCC).

To reliably do the optimization that you want, the compiler would have to manage sub-arrays or slices. Then it is becoming very complex and error-prone. A compiler optimizing that much is likely to consume a lot of memory (for the symbolic representations of sub-arrays) and a lot of compiler time. This is usually not worth the effort (which would be better spent inside the compiler to optimize loops).

BTW, GCC has a plugin framework and the MELT extension (MELT is a lispy domain specific language to extend GCC, and I am the main author of MELT). So you could try to add a new optimization pass (thru a MELT extension or some C++ plugin) doing the work. You'll soon realize that your pass would be either extraordinarily specific or would require handling a big lot of GCC internal representations, and is likely to blow up compilation time and memory for very few gains.

Notice that both GCC and Clang are cleverly unrolling the two loops (and that matters a lot performance-wise).

BTW, the Frama-C (a static analyzer for C programs developed by colleagues) value analyzer seems able to infer good properties about your arr

So, feel free to add that optimization to GCC. If you don't know (or don't have the time - many months or years) how to add it, feel free to pay a company or an organization able to enhance GCC for your needs. It is probably a million euro (or US dollars) / 3 years project to get that optimization working on interesting cases.

If you are serious about spending such amount of money, contact me by email.

A compiler which has such optimization would need some heuristics to disable them (e.g. if arr was a million-member array, and you was coding some sieve of Erasthothenes, it is probably not worth the compiler effort to keep all the unions of sub-slices of composite indexes at compile-time).

BTW, would you accept a twenty-times slower optimizing compiler (slower at compile time) for a gain (probably of a fraction of percent at run-time) which happens rarely in practice and is not very important? At last, I don't believe that this is a common case for optimizations. YMMV.

You might be perhaps interested by source to source transformers like PIPS4U

like image 158
Basile Starynkevitch Avatar answered Nov 16 '22 03:11

Basile Starynkevitch