Why can't (or doesn't) the compiler optimize a predictable addition loop into a multiplication?

The compiler can't generally transform

for (int c = 0; c < arraySize; ++c)
    if (data[c] >= 128)
        for (int i = 0; i < 100000; ++i)
            sum += data[c];

into

for (int c = 0; c < arraySize; ++c)
    if (data[c] >= 128)
        sum += 100000 * data[c];

because the latter could lead to overflow of signed integers where the former doesn't. Even with guaranteed wrap-around behaviour for overflow of signed two's complement integers, it would change the result: if data[c] is 30000, the product would become -1294967296 for typical 32-bit ints with wrap-around, while adding 30000 to sum 100000 times would, if that doesn't overflow, increase sum by 3000000000. Note that the same holds for unsigned quantities, with different numbers: overflow of 100000 * data[c] would typically introduce a reduction modulo 2^32 that must not appear in the final result.
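The numbers above can be checked with a small sketch. It assumes sum is a 64-bit accumulator (the text only says the additions do not overflow), and both function names are made up for illustration. The wrapped product is computed in unsigned arithmetic, where wrap-around is well defined, and then reinterpreted by hand as a two's complement value.

```c
#include <stdint.h>

/* Hypothetical helper: 100000 * value as a wrapping 32-bit two's
   complement int would see it. The low 32 bits are taken from an
   unsigned product, then mapped into the signed range by hand. */
int64_t product_wrapped_to_int32(int32_t value)
{
    uint32_t bits = (uint32_t)(UINT64_C(100000) * (uint32_t)value);
    return bits >= UINT32_C(0x80000000)
               ? (int64_t)bits - INT64_C(4294967296)
               : (int64_t)bits;
}

/* Hypothetical helper: the original inner loop, with an accumulator
   assumed to be 64 bits wide so that the additions cannot overflow. */
int64_t repeated_add(int32_t value)
{
    int64_t sum = 0;
    for (int i = 0; i < 100000; ++i)
        sum += value;
    return sum;
}
```

For value = 30000 the two functions disagree, which is exactly why the narrow multiplication is not a drop-in replacement for the loop.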

It could transform it into

for (int c = 0; c < arraySize; ++c)
    if (data[c] >= 128)
        sum += 100000LL * data[c];  // resp. 100000ull

though, if, as usual, long long is sufficiently larger than int.
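As a rough check of that claim, the sketch below compares the original loop with the widened multiplication. The function names and the size_t length parameter are assumptions added for illustration, and it relies on long long being wide enough to hold every partial sum.

```c
#include <stddef.h>

/* Original form: 100000 separate additions per qualifying element. */
long long sum_by_loop(const int *data, size_t n)
{
    long long sum = 0;
    for (size_t c = 0; c < n; ++c)
        if (data[c] >= 128)
            for (int i = 0; i < 100000; ++i)
                sum += data[c];
    return sum;
}

/* Collapsed form: one multiplication per qualifying element,
   carried out in long long so the product does not wrap. */
long long sum_by_multiply(const int *data, size_t n)
{
    long long sum = 0;
    for (size_t c = 0; c < n; ++c)
        if (data[c] >= 128)
            sum += 100000LL * data[c];
    return sum;
}
```

Both functions return the same value as long as the total fits in long long, which is the condition stated above.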

Why it doesn't do that, I can't tell. I guess it's what Mysticial said: "apparently, it does not run a loop-collapsing pass after loop-interchange".

Note that the loop-interchange itself is not generally valid (for signed integers), since

for (int c = 0; c < arraySize; ++c)
    if (condition(data[c]))
        for (int i = 0; i < 100000; ++i)
            sum += data[c];

can lead to overflow where

for (int i = 0; i < 100000; ++i)
    for (int c = 0; c < arraySize; ++c)
        if (condition(data[c]))
            sum += data[c];

wouldn't. It's kosher here, since the condition ensures all data[c] that are added have the same sign, so if one overflows, both do.
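To make the difference in overflow behaviour concrete, here is a sketch that runs both loop orders with a 64-bit accumulator and records the largest magnitude the running sum reaches. The condition is taken to be always true, and the helper names and the REPEAT macro are made up for this illustration. A peak above INT_MAX means the same loop with an int accumulator would have overflowed.

```c
#include <stddef.h>
#include <stdint.h>

#define REPEAT 100000

/* Absolute value of a 64-bit running sum. */
static int64_t magnitude(int64_t v) { return v < 0 ? -v : v; }

/* Original order: each element is added REPEAT times in a row. */
int64_t peak_original_order(const int *data, size_t n)
{
    int64_t sum = 0, peak = 0;
    for (size_t c = 0; c < n; ++c)
        for (int i = 0; i < REPEAT; ++i) {
            sum += data[c];
            if (magnitude(sum) > peak)
                peak = magnitude(sum);
        }
    return peak;
}

/* Interchanged order: one pass over the array, repeated REPEAT times. */
int64_t peak_interchanged(const int *data, size_t n)
{
    int64_t sum = 0, peak = 0;
    for (int i = 0; i < REPEAT; ++i)
        for (size_t c = 0; c < n; ++c) {
            sum += data[c];
            if (magnitude(sum) > peak)
                peak = magnitude(sum);
        }
    return peak;
}
```

With data = {30000, -30000}, the original order climbs to 3000000000 before coming back down, while the interchanged order never exceeds 30000. When all added values share a sign, both orders reach the same final peak.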

I wouldn't be too sure that the compiler took that into account, though (@Mysticial, could you try with a condition like data[c] & 0x80 or so, which can be true for positive and negative values?). I have had compilers make invalid optimisations. For example, a couple of years ago, ICC (11.0, iirc) used a signed-32-bit-int-to-double conversion in 1.0/n where n was an unsigned int. It was about twice as fast as gcc's output, but wrong: a lot of values were larger than 2^31. Oops.

This answer does not apply to the specific case linked, but it does apply to the question title and may be interesting to future readers:

Due to finite precision, repeated floating-point addition is not equivalent to multiplication. Consider:

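#include <iostream>

int main()
{
    float const step = 1e-15f;
    float const init = 1;
    long int const count = 1000000000;

    // Repeated addition: each step is far below float precision near 1.
    float result1 = init;
    for( long int i = 0; i < count; ++i ) result1 += step;

    // One multiplication followed by a single addition.
    float result2 = init;
    result2 += step * count;

    std::cout << (result1 - result2);
}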

The compiler contains various passes, each of which performs some optimization. Usually each pass optimizes either individual statements or loops. At present there is no model that optimizes a loop body based on the loop header. Such a case is hard to detect and not very common.

The optimization that was done here was loop-invariant code motion, which can be implemented using a set of techniques.
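For readers unfamiliar with the term, the sketch below shows loop-invariant code motion done by hand. It is a generic illustration, not the code from the question; the function names and the scale and offset parameters are invented for the example.

```c
#include <stddef.h>

/* Before: scale * offset does not depend on i, yet it is written
   inside the loop and would be recomputed on every iteration. */
long long sum_naive(const int *data, size_t n, int scale, int offset)
{
    long long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += data[i] + (long long)scale * offset;
    return sum;
}

/* After: the invariant product is computed once, ahead of the loop. */
long long sum_hoisted(const int *data, size_t n, int scale, int offset)
{
    const long long bias = (long long)scale * offset;
    long long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += data[i] + bias;
    return sum;
}
```

An optimizing compiler will typically perform this hoisting itself, so the two versions are expected to compile to similar code.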

Well, I'd guess that some compilers might do this sort of optimization, assuming that we are talking about integer arithmetic.

At the same time, some compilers might refuse to do it because replacing repetitive addition with multiplication might change the overflow behavior of the code. For unsigned integer types, it shouldn't make a difference since their overflow behavior is fully specified by the language. But for signed ones, it might (probably not on a 2's complement platform, though). It is true that signed overflow actually leads to undefined behavior in C, meaning that it should be perfectly OK to ignore those overflow semantics altogether, but not all compilers are brave enough to do that. It often draws a lot of criticism from the "C is just a higher-level assembly language" crowd. (Remember what happened when GCC introduced optimizations based on strict-aliasing semantics?)
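The unsigned case can be illustrated with a short sketch, assuming the accumulator has the same 32-bit unsigned type as the value being added. The function names are hypothetical. Because unsigned arithmetic wraps modulo 2^32 by definition, the loop and the single multiplication agree even when the result wraps.

```c
#include <stdint.h>

/* Repeated addition in a 32-bit unsigned accumulator. */
uint32_t add_repeatedly(uint32_t value, uint32_t count)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < count; ++i)
        sum += value;   /* wraps modulo 2^32, which is well defined */
    return sum;
}

/* One multiplication, reduced modulo 2^32 by the final conversion. */
uint32_t multiply_once(uint32_t value, uint32_t count)
{
    return (uint32_t)((uint64_t)value * count);
}
```

This only holds when the accumulator and the product share the same width; with a wider accumulator, the narrow product would introduce a reduction the loop does not have.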

Historically, GCC has shown itself as a compiler that has what it takes to take such drastic steps, but other compilers might prefer to stick with the perceived "user-intended" behavior even if it is undefined by the language.

Tags:

performance

c

compiler-optimization
