Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't gcc remove this check of a non-volatile variable?

This question is mostly academic. I ask out of curiosity, not because this poses an actual problem for me.

Consider the following incorrect C program.

#include <signal.h>
#include <stdio.h>

static int running = 1;

void handler(int u) {
    running = 0;
}

int main() {
    signal(SIGTERM, handler);
    while (running)
        ;
    printf("Bye!\n");
    return 0;
}

This program is incorrect because the handler interrupts the program flow, so running can be modified at any time and should therefore be declared volatile. But let's say the programmer forgot that.

gcc 4.3.3, with the -O3 flag, compiles the loop body (after one initial check of the running flag) down to the infinite loop

.L7:
        jmp     .L7

which was to be expected.

Now we put something trivial inside the while loop, like:

    while (running)
        putchar('.');

And suddenly, gcc does not optimize the loop condition anymore! The loop body's assembly now looks like this (again at -O3):

.L7:
        movq    stdout(%rip), %rsi
        movl    $46, %edi
        call    _IO_putc
        movl    running(%rip), %eax
        testl   %eax, %eax
        jne     .L7

We see that running is re-loaded from memory each time through the loop; it is not even cached in a register. Apparently gcc now thinks that the value of running could have changed.

So why does gcc suddenly decide that it needs to re-check the value of running in this case?

like image 767
Thomas Avatar asked Mar 25 '10 18:03

Thomas


People also ask

Does volatile prevent optimization?

The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler. Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time.

What does the compiler do for volatile?

Volatile is a qualifier that is applied to a variable when it is declared. It tells the compiler that the value of the variable may change at any time-without any action being taken by the code the compiler finds nearby.

What is the difference between volatile and non volatile variable in C?

The volatile memory stores data and computer programs that the CPU may need in real-time, and it erases them once a user switches off the computer. Cache memory and RAM are types of Volatile memory. Non-volatile memory, on the other hand, is static. It remains in a computer even after a user switches it off.

Are global variables volatile?

No, globals are not implicitly volatile. Your assumption that threads could communicate only through the use of global variables is incorrect.


1 Answers

In the general case it's difficult for a compiler to know exactly which objects a function might have access to and therefore could potentially modify. At the point where putchar() is called, GCC doesn't know if there might be a putchar() implementation that might be able to modify running so it has to be somewhat pessimistic and assume that running might in fact have been changed.

For example, there might be a putchar() implementation later in the translation unit:

int putchar( int c)
{
    running = c;
    return c;
}

Even if there's not a putchar() implementation in the translation unit, there could be something that might, for example, pass the address of the running object such that putchar might be able to modify it:

void foo(void)
{
    set_putchar_status_location( &running);
}

Note that your handler() function is globally accessible, so putchar() might call handler() itself (directly or otherwise), which is an instance of the above situation.

On the other hand, since running is visible only to the translational unit (being static), by the time the compiler gets to the end of the file it should be able to determine that there is no opportunity for putchar() to access it (assuming that's the case), and the compiler could go back and 'fix up' the pessimization in the while loop.

Since running is static, the compiler might be able to determine that it's not accessible from outside the translation unit and make the optimization you're talking about. However, since it's accessible through handler() and handler() is accessible externally, the compiler can't optimize the access away. Even if you make handler() static, it's accessible externally since you pass the address of it to another function.

Note that in your first example, even though what I mentioned in the above paragraph is still true the compiler can optimize away the access to running because the 'abstract machine model' the C language is based on doesn't take into account asynchronous activity except in very limited circumstances (one of which is the volatile keyword and another is signal handling, though the requirements of the signal handling aren't strong enough to prevent the compiler being able to optimize away the access to running in your first example).

In fact, here's something the C99 says about the abstract machine behavior in pretty much these exact circumstances:

5.1.2.3/8 "Program execution"

EXAMPLE 1:

An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.

Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation defined restrictions.

Finally, you should note that the C99 standard also says:

7.14.1.1/5 "The signal function`

If the signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t...

So strictly speaking the running variable may need to be declared as:

volatile sig_atomic_t running = 1;
like image 147
Michael Burr Avatar answered Nov 15 '22 19:11

Michael Burr