Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using volatile to prevent compiler optimization in benchmarking code?

Tags:

c++

I am creating a little program measure the performance difference between containers of types boost::shared_ptr and boost::intrusive_ptr. In order to prevent the compiler from optimizing away the copy I declare the variable as volatile. The loop looks like this:

// TestCopy measures the time required to create n copies of the given container.
// Returns time in milliseconds.
template<class Container>
time_t TestCopy(const Container & inContainer, std::size_t n) {
    Poco::Stopwatch stopwatch;
    stopwatch.start();
    for (std::size_t idx = 0; idx < n; ++idx)
    {
        volatile Container copy = inContainer; // Volatile!
    }

    // convert microseconds to milliseconds
    return static_cast<time_t>(0.5 + (double(stopwatch.elapsed()) / 1000.0));
}

The rest of the code can be found here: main.cpp.

  • Will using volatile here prevent the compiler from optimizing away the copy?
  • Are there any pitfalls that may invalidate the results?

Update

In response to @Neil Butterworth. Even when using the copy it still seems to me that the compiler could easily avoid the copy:

for (std::size_t idx = 0; idx < n; ++idx)
{
    // gcc won't remove this copy?
    Container copy = inContainer;
    gNumCopies += copy.size();        
}
like image 868
StackedCrooked Avatar asked May 25 '11 19:05

StackedCrooked


People also ask

How does volatile affect code optimization by compiler?

Effect of the volatile keyword on compiler optimizationIf you do not use the volatile keyword where it is needed, then the compiler might optimize accesses to the variable and generate unintended code or remove intended functionality.

How do I keep my compiler from optimizing variables?

Select the compiler. Under the Optimization category change optimization to Zero. When done debugging you can uncheck Override Build Options for the file. In the latter case the volatile defined inside the function can get optimized out quite often. ...

How do I stop GCC optimization?

The gcc option -O enables different levels of optimization. Use -O0 to disable them and use -S to output assembly.

What are the techniques under compile optimization?

Typical interprocedural optimizations are: procedure inlining, interprocedural dead-code elimination, interprocedural constant propagation, and procedure reordering. As usual, the compiler needs to perform interprocedural analysis before its actual optimizations.

Why should code optimization be controlled by a compile time flag?

Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.


2 Answers

The C++03 standard says that reads and writes to volatile data is observable behavior (C++ 2003, 1.9 [intro.execution] / 6). I believe this guarantees that assignment to volatile data cannot be optimized away. Another kind of observable behavior is calls to I/O functions. The C++11 standard is even more unambiguous in this regard: in 1.9/8 it explicitly says that

The least requirements on a conforming implementation are:
— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

If a compiler can prove that a code does not produce an observable behavior then it can optimize the code away. In your update (where volatile is not used), copy constructor and other function calls & overloaded operators might avoid any I/O calls and access to volatile data, and the compiler might well understand it. However if gNumCopies is a global variable that later used in an expression with observable behavior (e.g. printed), then this code will not be removed.

like image 84
Alexey Kukanov Avatar answered Sep 29 '22 12:09

Alexey Kukanov


Volatile is unlikely to do what you expect for a non-POD type. I would recommend passing a char * or void * aliasing the container to an empty function in a different translation unit. Since the compiler is unable to analyze the usage of the pointer, this will act as a compiler memory barrier, forcing the object out to the processor cache at least, and preventing most dead-value-elimination optimizations.

like image 45
bdonlan Avatar answered Sep 29 '22 10:09

bdonlan