Function optimized to infinite loop at 'gcc -O2'

Tags: c, optimization, gcc, undefined-behavior

Context
I was asked the following puzzle by one of my friends:

    #include <stdio.h>

    void fn(void) {
      /* write something after this comment so that the program output is 10 */
      /* write something before this comment */
    }

    int main() {
      int i = 5;
      fn();
      printf("%d\n", i);
      return 0;
    }

I know there are multiple solutions, some involving macros and some that assume something about the implementation and violate the C standard.
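As an illustration of the macro route, here is a minimal sketch. It assumes the text inserted between the two comments may contain a preprocessor directive. `puzzle_main` is a hypothetical stand-in for `main` that returns `i` instead of printing it, so the result is easy to check.

```c
/* Sketch of a macro-based solution. The #define sits inside fn's body,
   but preprocessor directives ignore C scopes, so the macro stays in
   effect for the rest of the file. */
void fn(void) {
  /* write something after this comment so that the program output is 10 */
#define fn() i = 10
  /* write something before this comment */
}

/* Hypothetical stand-in for main(): returns i instead of printing it. */
int puzzle_main(void) {
  int i = 5;
  fn();      /* the preprocessor rewrites this call to: i = 10; */
  return i;
}
```

Nothing here depends on stack layout. The real function `fn` is never called, because the call site is rewritten before compilation.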

One particular solution I was interested in makes certain assumptions about the stack and uses the following code (I understand it is undefined behavior, but it may work as expected on many implementations):

    void fn(void) {
      /* write something after this comment so that the program output is 10 */
      int a[1] = {0};
      int j = 0;
      while(a[j] != 5) ++j;  /* Search stack until you find 5 */
      a[j] = 10;             /* Overwrite it with 10 */
      /* write something before this comment */
    }

Problem
This program worked fine with MSVC and with gcc without optimization. But when I compiled it with the gcc -O2 flag or tried it on ideone, it looped infinitely in function fn.

My Observation
When I compiled the file with gcc -S and with gcc -S -O2 and compared the output, it clearly showed that gcc kept an infinite loop in function fn.

Question
I understand that, because the code invokes undefined behavior, this cannot be called a bug. But why and how does the compiler analyze the behavior and leave an infinite loop at -O2?

Several commenters asked what happens if some of the variables are made volatile. The results, as expected, are:

If i or j is made volatile, the program behavior remains the same.
If the array a is made volatile, the program does not loop infinitely.
Moreover, if I apply the following patch, the program behavior remains the same (infinite loop):

    -  int a[1] = {0};
    +  int aa[1] = {0};
    +  int *a = aa;

If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:

    ;; Function fn (fn) (executed once)

    Removing basic block 3
    fn ()
    {
    <bb 2>:

    <bb 3>:
      goto <bb 3>;

    }



    ;; Function main (main) (executed once)

    main ()
    {
    <bb 2>:
      fn ();

    }
    Invalid sum of incoming frequencies 0, should be 10000

This confirms the explanations given in the answers below.


asked Feb 20 '15 14:02

Mohit Jain

2 Answers

This is undefined behavior, so the compiler can really do anything at all. We can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:

    .L2:
        jmp .L2

The article says (emphasis mine):

Of course this is an infinite loop. Since SATD() unconditionally executes undefined behavior (it’s a type 3 function), any translation (or none at all) is perfectly acceptable behavior for a correct C compiler. The undefined behavior is accessing d[16] just before exiting the loop. In C99 it is legal to create a pointer to an element one position past the end of the array, but that pointer must not be dereferenced. Similarly, the array cell one element past the end of the array must not be accessed.

If we examine your program with godbolt, we see the same result:

    fn:
    .L2:
        jmp .L2

The logic being used by the optimizer probably goes something like this:

All the elements of a are initialized to zero
a is never modified before or within the loop
So, assuming every access stays within the array (anything else is undefined behavior), a[j] != 5 is always true, which gives an infinite loop
Because the loop is infinite, a[j] = 10; is unreachable and can be optimized away, and so can a and j, since they are no longer needed to determine the loop condition.

This is similar to the case in the article, which, given:

int d[16];

analyzes the following loop:

for (dd=d[k=0]; k<16; dd=d[++k])

like this:

upon seeing d[++k], is permitted to assume that the incremented value of k is within the array bounds, since otherwise undefined behavior occurs. For the code here, GCC can infer that k is in the range 0..15. A bit later, when GCC sees k<16, it says to itself: “Aha– that expression is always true, so we have an infinite loop.”
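One way to restructure that loop so that it never reads d[16] is sketched below. The loop body is not shown in the excerpt above, so summing absolute values in a hypothetical function `sum_abs` is an assumption made only to give the loop something to compute.

```c
#include <stdlib.h>  /* abs */

#define N 16

int d[N];

/* Hypothetical body: sum of absolute values of d[0..N-1].
   The bound check k < N runs before every read, so the last element
   read is d[N-1] and the loop never touches d[N]. */
int sum_abs(void) {
  int total = 0;
  for (int k = 0; k < N; ++k) {
    int dd = d[k];
    total += abs(dd);
  }
  return total;
}
```

With the read moved inside the body, the compiler can no longer derive "k is always in 0..15" from an access that happens before the test, so k < N stays a real exit condition.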

Perhaps an interesting secondary point is whether an infinite loop is considered observable behavior (with respect to the as-if rule), which affects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:

Many knowledgeable people (including me) read this as saying that the termination behavior of a program must not be changed. Obviously some compiler writers disagree, or else don’t believe that it matters. The fact that reasonable people disagree on the interpretation would seem to indicate that the C standard is flawed.

C11 adds a clarification to section 6.8.5 Iteration statements, which is covered in more detail in this answer.
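As a rough illustration of the kind of loop that clarification covers: C11 6.8.5 paragraph 6 lets an implementation assume that a loop terminates if its controlling expression is not a constant expression and it performs no input/output, volatile accesses, synchronization, or atomic operations. The function below, with the hypothetical name `collatz_steps`, is a sketch of such a loop. By contrast, while(1) has a constant controlling expression and is not covered by that permission.

```c
/* Counts Collatz steps until n reaches 1.
   The controlling expression (n != 1) is not a constant expression and
   the body has no I/O, volatile, or atomic accesses, so under C11
   6.8.5p6 a compiler may assume this loop terminates, even though for
   n == 0 it never would. */
unsigned collatz_steps(unsigned long n) {
  unsigned steps = 0;
  while (n != 1) {
    n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    ++steps;
  }
  return steps;
}
```

For example, collatz_steps(6) follows 6, 3, 10, 5, 16, 8, 4, 2, 1 and returns 8.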

answered Nov 08 '22 13:11

Shafik Yaghmour

In the optimized version, the compiler has decided a few things:

The array a doesn't change before that test.
The array a doesn't contain a 5.

Therefore, we can rewrite the code as:

    void fn(void) {
      int a[1] = {0};
      int j = 0;
      while(true) ++j;
      a[j] = 10;
    }

Now, we can make further decisions:

All the code after the while loop is dead code (unreachable).
j is written but never read. So we can get rid of it.
a is never read.

At this point, your code has been reduced to:

    void fn(void) {
      int a[1] = {0};
      while(true);
    }

We can also note that a is now never read, so let's get rid of it as well:

    void fn(void) {
      while(true);
    }

Now, the unoptimized code:

In unoptimized generated code, the array remains in memory, and you literally walk it at runtime. Once you walk past the end of the array, it's possible that there will be a readable 5 somewhere after it.

That is why the unoptimized version sometimes doesn't crash and burn.
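The runtime walk can be modeled with defined behavior by making the "stack" an explicit array and passing its length. This is only a sketch: `find_and_patch` is a hypothetical helper, and a real stack frame layout is implementation-specific and not visible to portable C.

```c
#include <stddef.h>

/* Scans mem[0..len) for target and overwrites the first match with
   replacement. Returns the index of the match, or -1 if there is none.
   Unlike the puzzle code, it never reads outside the buffer. */
long find_and_patch(int *mem, size_t len, int target, int replacement) {
  for (size_t j = 0; j < len; ++j) {
    if (mem[j] == target) {
      mem[j] = replacement;
      return (long)j;
    }
  }
  return -1;
}
```

Called on a buffer such as {0, 0, 42, 7, 5, 1} with target 5 and replacement 10, it returns 4 and leaves 10 in that slot. Because the search is bounded, the optimizer has no out-of-bounds access to reason from and the loop keeps a real exit condition.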

answered Nov 08 '22 13:11

Bill Lynch
