
for loop being ignored (optimized?) out

I am using for/while loops for implementing a delay in my code. The duration of the delay is unimportant here though it is sufficiently large to be noticeable. Here is the code snippet.

uint32_t i;

// Do something useful

for (i = 0; i < 50000000U; ++i)
{}

// Do something useful

The issue I am observing is that this for loop won't get executed. It probably gets ignored/optimized by the compiler. However, if I qualify the loop counter i by volatile, the for loop seems to execute and I do notice the desired delay in the execution.

This behavior seems a bit counter-intuitive to my understanding of the compiler optimizations with/without the volatile keyword.

Even if the loop counter is getting optimized and being stored in the processor register, shouldn't the counter still work, perhaps with a lesser delay? (Since the memory fetch overhead is done away with.)

The platform I am building for is an Xtensa processor (by Tensilica), and the C compiler is the one provided by Tensilica, the Xtensa C/C++ compiler, running with the highest level of optimization.

I tried the same with gcc 4.4.7 at the -O3 and -Ofast optimization levels. The delay seems to work in that case.

Asked by LoneWolf, May 13 '15 07:05


1 Answer

This is all about observable behavior. Your loop has none: its only effect is that i equals 50000000U afterwards. The compiler is therefore allowed to replace the loop with i = 50000000U;. That assignment can then be removed as well, because the value of i has no observable consequences.
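The two steps described above can be sketched in C as follows. This is only an illustration of what an optimizer is permitted to do, not a dump of what any particular compiler emits; the function names count_as_written and count_folded are made up for this example, and the counter is returned only so that the two versions can be compared.

```c
#include <stdint.h>

/* The loop as written in the question. The counter is returned here
 * (an addition for illustration) so its final value can be inspected. */
uint32_t count_as_written(void)
{
    uint32_t i;

    for (i = 0; i < 50000000U; ++i)
    {}

    return i;
}

/* What an optimizer may reduce the loop to: the final value of i is
 * known at compile time, so no iteration is needed at all. */
uint32_t count_folded(void)
{
    return 50000000U;
}
```

Both functions return the same value, so a conforming compiler may treat them as interchangeable. In the original snippet i is never read after the loop, so even the constant assignment can be dropped and nothing is left to execute.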

The volatile keyword tells the compiler that every read from and write to i is itself observable behavior, which prevents it from optimizing those accesses away.
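A minimal sketch of the volatile variant, wrapped in a hypothetical helper called spin_delay (the name and the iterations parameter are assumptions for this example, not part of the question):

```c
#include <stdint.h>

/* Busy-wait for the given number of loop iterations.
 * Because i is volatile, each read and write of i must actually be
 * performed, so the compiler cannot collapse or delete the loop. */
uint32_t spin_delay(uint32_t iterations)
{
    volatile uint32_t i;

    for (i = 0; i < iterations; ++i)
    {}

    return i; /* final counter value, returned only for inspection */
}
```

Calling spin_delay(50000000U) reproduces the loop from the question. How long it actually takes still depends on the clock speed, the compiler and the optimization level, so this guarantees that the loop runs, not how long it runs.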

The compiler will also not optimize away calls to functions whose code it cannot see. Theoretically, if a compiler had access to the whole OS code, it could optimize everything except accesses to volatile variables, which are typically used for hardware I/O operations.

These optimization rules all conform to what is written in the C standard (cf. comments for references).

Also, if you want a delay, use a dedicated function (e.g. an OS API). Such functions are reliable and, unlike a spin delay like yours, don't keep the CPU busy while waiting.
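As a rough sketch of what such a call can look like, the following assumes a POSIX system where nanosleep is available. That is an assumption: the Xtensa target in the question may run an RTOS or no OS at all, in which case the equivalent would be that environment's own delay or timer API. The helper name sleep_ms is made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* ask for the nanosleep declaration */
#endif

#include <errno.h>
#include <time.h>

/* Sleep for roughly ms milliseconds without spinning.
 * Returns 0 on success, -1 on a negative argument or a nanosleep error. */
int sleep_ms(long ms)
{
    struct timespec req;
    struct timespec rem;

    if (ms < 0)
        return -1;

    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    /* If a signal interrupts the sleep, continue with the remaining time. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }

    return 0;
}
```

While the thread sleeps, the scheduler can run other work or let the core idle, which is the difference from the spin loop.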

Answered by ElderBug, Oct 22 '22 04:10