I am writing a small program to measure the performance difference between containers of boost::shared_ptr and boost::intrusive_ptr. To prevent the compiler from optimizing away the copy, I declare the variable as volatile. The loop looks like this:
// TestCopy measures the time required to create n copies of the given container.
// Returns time in milliseconds.
template<class Container>
time_t TestCopy(const Container & inContainer, std::size_t n) {
    Poco::Stopwatch stopwatch;
    stopwatch.start();
    for (std::size_t idx = 0; idx < n; ++idx)
    {
        volatile Container copy = inContainer; // Volatile!
    }
    // convert microseconds to milliseconds
    return static_cast<time_t>(0.5 + (double(stopwatch.elapsed()) / 1000.0));
}
The rest of the code can be found here: main.cpp.
In response to @Neil Butterworth: even when the copy is used, it still seems to me that the compiler could easily avoid the copy:
for (std::size_t idx = 0; idx < n; ++idx)
{
    // gcc won't remove this copy?
    Container copy = inContainer;
    gNumCopies += copy.size();
}
Effect of the volatile keyword on compiler optimization
If you do not use the volatile keyword where it is needed, then the compiler might optimize accesses to the variable and generate unintended code or remove intended functionality.
The gcc option -O enables different levels of optimization. Use -O0 to disable them and use -S to output assembly.
Typical interprocedural optimizations are: procedure inlining, interprocedural dead-code elimination, interprocedural constant propagation, and procedure reordering. As usual, the compiler needs to perform interprocedural analysis before its actual optimizations.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
The C++03 standard says that reads and writes to volatile data are observable behavior (C++ 2003, 1.9 [intro.execution] / 6). I believe this guarantees that an assignment to volatile data cannot be optimized away. Another kind of observable behavior is a call to an I/O function. The C++11 standard is even more explicit in this regard; 1.9/8 says:
The least requirements on a conforming implementation are:
— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.
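To make the rule concrete, here is a minimal sketch with a scalar volatile object. The names gSink and WriteToSink are made up for illustration and are not part of the original program. Every store in the loop is a volatile access, so a conforming compiler should emit each one instead of collapsing the loop into a single assignment.

```cpp
#include <cstddef>

// Hypothetical sink variable: every read or write of it is a volatile access.
volatile std::size_t gSink = 0;

// Stores 0, 1, ..., n-1 into gSink and returns the value read back afterwards.
// With a non-volatile sink the optimizer would be free to keep only the last
// store (or none at all if the result were unused).
std::size_t WriteToSink(std::size_t n) {
    for (std::size_t idx = 0; idx < n; ++idx) {
        gSink = idx; // volatile write: observable behavior
    }
    return gSink;    // volatile read: observable behavior
}
```

Compiling this with gcc -O2 -S and looking for the store inside the loop is one way to check the behavior on a given compiler.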
If a compiler can prove that a piece of code does not produce observable behavior, then it can optimize that code away. In your update (where volatile is not used), the copy constructor and the other function calls and overloaded operators might involve no I/O calls and no access to volatile data, and the compiler might well be able to see that. However, if gNumCopies is a global variable that is later used in an expression with observable behavior (e.g. printed), then this code will not be removed.
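A minimal sketch of that variant, under some assumptions: it uses std::chrono::steady_clock in place of Poco::Stopwatch and returns long long instead of time_t so that it needs no third-party headers, and it expects the caller to print gNumCopies after the measurement. Printing keeps the accumulation alive; whether the copy itself is actually performed still depends on the optimizer, so inspecting the generated assembly remains worthwhile.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Global accumulator. The caller is expected to print it (or use it in some
// other observable way) once the measurement is finished.
std::size_t gNumCopies = 0;

// Copies inContainer n times, adds each copy's size to gNumCopies,
// and returns the elapsed time in whole milliseconds.
template <class Container>
long long TestCopy(const Container& inContainer, std::size_t n) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t idx = 0; idx < n; ++idx) {
        Container copy = inContainer;
        gNumCopies += copy.size(); // result flows into a global
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

A caller could build a std::vector<std::shared_ptr<int>>, pass it to TestCopy with a large n, and then write both the returned time and gNumCopies to std::cout.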
Volatile is unlikely to do what you expect for a non-POD type. I would recommend passing a char * or void * that aliases the container to an empty function in a different translation unit. Since the compiler is unable to analyze how the pointer is used, the call acts as a compiler memory barrier, forcing the object out to the processor cache at least and preventing most dead-value-elimination optimizations.
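The sketch below illustrates the idea in a single file. Note the substitution: instead of a function in a separate translation unit, it calls an empty function through a volatile function pointer, which the compiler has to load at run time and therefore cannot treat as a known call target. The names NoOp, gEscape and CopyAndEscape are hypothetical. A real benchmark could just as well put the empty function in its own .cpp file as suggested above, provided link-time optimization is off.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the empty function in a different translation unit.
void NoOp(const void*) {}

// Reading a volatile pointer is observable behavior, so the compiler must
// load it on every call and cannot assume that it still points at NoOp.
void (*volatile gEscape)(const void*) = &NoOp;

// Copies inContainer n times and lets the address of each copy escape
// through the opaque call. Returns the number of calls made.
template <class Container>
std::size_t CopyAndEscape(const Container& inContainer, std::size_t n) {
    std::size_t calls = 0;
    for (std::size_t idx = 0; idx < n; ++idx) {
        Container copy = inContainer;
        gEscape(&copy); // the callee could inspect copy, so it must exist
        ++calls;
    }
    return calls;
}
```

Because the callee could in principle read the object through the pointer, the compiler has to finish constructing each copy before the call.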