
Benchmarking using <ctime> and instruction reordering

Tags: c++, c++11

I've been using, up until now, the traditional way to benchmark concurrent methods, which is to measure the elapsed duration for a number of runs:

#include <cstddef> // size_t
#include <ctime>   // time, time_t, difftime

template <typename Functor>
double benchmark(Functor const& f, size_t nbRuns)
{
  if (nbRuns == 0) { return 0.0; }

  f(); // Initialize before measuring, I am not interested in setup cost

  time_t begin = time(0);
  for (size_t i = 0; i != nbRuns; ++i) { f(); }
  time_t end = time(0);

  return difftime(end, begin);
}

which seemed all fine and dandy until I came upon this question: Optimizing away a "while(1);" loop in C++0x.

What strikes me as unusual is that the compiler is allowed to execute the output BEFORE the loop... and I am suddenly wondering:

What prevents the compiler from executing time_t end = time(0); before the loop here?

because if it did, that would rather screw up my little benchmark code.

And while we are at it, if ever the reordering could occur in this situation:

How can one prevent it?

I could not think of relevant tags apart from the C++ ones; if anyone thinks I've missed one, feel free to add it.

Matthieu M. asked Nov 22 '10 16:11


2 Answers

This is a tricky question.

What prevents the compiler from executing time_t end = time(0); before the loop here?

Generally, nothing, and that was already true in C++03. Under the as-if rule, the compiler may emit any code that has the same observable behaviour. So if omitting f() changes no specified input/output and no access to a volatile object, the compiler need not run f() at all.
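To make the point about volatile accesses concrete, here is a minimal sketch of a variant of the question's benchmark that stores each result of f() into a volatile object. The name benchmark_observable, the sink variable, the switch to std::chrono::steady_clock, and the requirement that f() return something convertible to long long are all assumptions made for illustration; none of them come from the question. Note that this only guarantees that the stores happen. If f() is a pure computation the optimizer can see through, it might still be computed once and stored repeatedly, so treat this as a partial measure rather than a guarantee.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical variant of the question's benchmark(); not part of the original post.
// Assumes f() returns a value convertible to long long.
template <typename Functor>
double benchmark_observable(Functor const& f, std::size_t nbRuns)
{
  if (nbRuns == 0) { return 0.0; }

  // Stores to a volatile object count as observable behaviour,
  // so the compiler has to perform every one of them.
  volatile long long sink = f(); // warm-up run, not timed

  auto const begin = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i != nbRuns; ++i) { sink = f(); }
  auto const end = std::chrono::steady_clock::now();

  // Elapsed time in seconds, with sub-second resolution.
  return std::chrono::duration<double>(end - begin).count();
}
```

It is called like the original, for example benchmark_observable(someFunctor, 1000), where someFunctor is whatever callable is being measured.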

What strikes me as unusual is that the compiler is allowed to execute the output BEFORE the loop

That's not really true. The issue with the empty loop is that C++0x doesn't count mere nontermination as observable behavior. It's not that the compiler can reorder the empty loop and the output of "Hello"; rather, it can leave out the empty loop altogether.

jpalecek answered Sep 21 '22 13:09


Normally I would use a scoped timer object, so that the "end" time is calculated in its destructor when it goes out of scope.

Would the compiler be allowed to execute its destructor whilst still in the scope? I don't know.

Of course time_t only measures seconds, so I would normally measure at a finer grain, usually milliseconds. Sometimes milliseconds are not granular enough (e.g. very small functions that are called lots of times), in which case you would probably use microseconds.

Of course in this case there is an overhead in entering and leaving the scope itself, but it is often a good measure for "intrusive" profiling, which tends to work well for optimising real code. (You can often switch the feature on and off.)
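A scoped timer of the kind described above might look like the following sketch. The class name ScopedTimer, the caller-supplied counter, and the use of std::chrono::steady_clock with microsecond resolution are assumptions for illustration, not code from this answer. It says nothing about whether the compiler may move the measured work relative to the clock reads; it only shows the construct/destruct pattern.

```cpp
#include <chrono>

// Hypothetical RAII timer: records the start time on construction and adds
// the elapsed microseconds to a caller-supplied counter on destruction.
class ScopedTimer
{
public:
  explicit ScopedTimer(long long& totalMicros)
    : total_(totalMicros), begin_(std::chrono::steady_clock::now()) {}

  ~ScopedTimer()
  {
    auto const end = std::chrono::steady_clock::now();
    total_ += std::chrono::duration_cast<std::chrono::microseconds>(end - begin_).count();
  }

  // Non-copyable, so the same interval is never accumulated twice.
  ScopedTimer(ScopedTimer const&) = delete;
  ScopedTimer& operator=(ScopedTimer const&) = delete;

private:
  long long& total_;                            // accumulated microseconds
  std::chrono::steady_clock::time_point begin_; // start of this scope
};
```

To use it, wrap the region to be measured in braces and declare a ScopedTimer at the top of that block; the counter passed to the constructor is updated when the block exits.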

CashCow answered Sep 20 '22 13:09