Benchmarking using <ctime> and instruction reordering

Question

Up until now, I've been benchmarking competing methods the traditional way, which is to measure the elapsed duration for a number of runs:

#include <cstddef> // size_t
#include <ctime>   // time_t, time, difftime

template <typename Functor>
double benchmark(Functor const& f, size_t nbRuns)
{
  if (nbRuns == 0) { return 0.0; }

  f(); // Initialize before measuring, I am not interested in setup cost

  time_t begin = time(0);
  for (size_t i = 0; i != nbRuns; ++i) { f(); }
  time_t end = time(0);

  return difftime(end, begin);
}

which seemed all fine and dandy until I came upon this question: Optimizing away a "while(1);" loop in C++0x.

What strikes me as unusual is that the compiler is allowed to execute the output BEFORE the loop... and I am suddenly wondering:

What prevents the compiler from executing time_t end = time(0); before the loop here?

because if it did, that would somehow screw my little benchmark code.

And while we are at it, if ever the reordering could occur in this situation:

How can one prevent it?

I could not think of relevant tags apart from the C++ ones; if anyone thinks I've missed one, feel free to add it.

jpalecek · Accepted Answer

This is a tricky question.

What prevents the compiler from executing time_t end = time(0); before the loop here?

Generally, nothing, and that was already true in C++03. Because of the as-if rule, the compiler may emit any code that has the same observable behaviour. That means that if omitting f() doesn't change any specified input/output or any access to volatile objects, the compiler need not run f() at all.
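To make the as-if rule concrete in the question's setting, here is a minimal sketch, assuming the functor returns a value convertible to long. It stores every result in a volatile object, and since accesses to volatile objects are part of the observable behaviour, the calls can no longer be dropped wholesale. The names benchmark_observable, sink and lastResult are hypothetical. This only forces each result to be produced; the compiler may still simplify how it is computed, and the sketch does not on its own settle where the calls to time end up (in practice they are library calls that compilers treat as opaque).

```cpp
#include <cstddef>
#include <ctime>

// Hypothetical variant of the question's benchmark(): assumes f() returns
// something convertible to long, and writes every result to a volatile
// object so that each call contributes to the observable behaviour.
template <typename Functor>
double benchmark_observable(Functor const& f, std::size_t nbRuns, long& lastResult)
{
  volatile long sink = 0; // writes to a volatile object must be performed

  std::time_t begin = std::time(0);
  for (std::size_t i = 0; i != nbRuns; ++i) { sink = f(); }
  std::time_t end = std::time(0);

  lastResult = sink; // volatile read of the last stored value
  return std::difftime(end, begin);
}
```

A caller would pass a functor that returns the value it computes, so that the value feeds the volatile store instead of being discarded.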

What strikes me as unusual is that the compiler is allowed to execute the output BEFORE the loop

That's not really true. The issue with the empty loop is that C++0x doesn't count mere nontermination as observable behaviour. It's not that the compiler can reorder the empty loop and the output of "Hello"; rather, it can leave out the empty loop altogether.

CashCow · Answer

Normally I would put my timer into a scope using an object so it calculates the "end" in its destructor when it goes out of scope.

Would the compiler be allowed to execute its destructor whilst still in the scope? I don't know.
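The scoped timer described above could be sketched roughly as follows. This is an illustration under assumptions, not necessarily what the answer had in mind: the class name ScopedTimer is hypothetical, and it uses std::chrono::steady_clock from C++11 with a caller-supplied counter for the accumulated microseconds.

```cpp
#include <chrono>

// Hypothetical RAII timer: records the start time on construction and adds
// the elapsed microseconds to a caller-supplied counter on destruction.
class ScopedTimer
{
public:
  explicit ScopedTimer(long long& totalMicros)
    : total_(totalMicros), start_(std::chrono::steady_clock::now()) {}

  ~ScopedTimer()
  {
    // The "end" measurement is taken when the object goes out of scope.
    auto elapsed = std::chrono::steady_clock::now() - start_;
    total_ += std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  }

  // Non-copyable: a copy would report the same interval twice.
  ScopedTimer(ScopedTimer const&) = delete;
  ScopedTimer& operator=(ScopedTimer const&) = delete;

private:
  long long& total_;
  std::chrono::steady_clock::time_point start_;
};
```

To use it, declare a ScopedTimer at the top of the block to be measured; the counter is updated when the block is left, including on early return or exception.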

Of course, time_t only has a resolution of seconds, so I would normally measure at a finer grain, usually milliseconds. Sometimes milliseconds are not fine enough (e.g. very small functions that are called many times), in which case you would probably use microseconds.
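As an illustration of measuring at a finer grain, here is a sketch of the question's benchmark function reporting microseconds. It assumes C++11's std::chrono is available; the name benchmark_micros is hypothetical. It changes only the resolution, so the reordering concerns raised in the question apply to it unchanged.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical finer-grained variant of the question's benchmark(): returns
// the elapsed time in microseconds instead of whole seconds.
template <typename Functor>
long long benchmark_micros(Functor const& f, std::size_t nbRuns)
{
  typedef std::chrono::steady_clock Clock; // monotonic, unaffected by clock adjustments

  Clock::time_point begin = Clock::now();
  for (std::size_t i = 0; i != nbRuns; ++i) { f(); }
  Clock::time_point end = Clock::now();

  return std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
}
```

Dividing the returned total by nbRuns gives an average per call, which is usually more meaningful than timing a single very short call.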

Of course, in this case there is an overhead in entering and leaving the scope itself, but it is often a good measure for "intrusive" profiling, which is often very useful for optimising real cases. (You can often switch the feature on and off.)

Tags: c++, c++11

Matthieu M.
