I would like to measure the execution time of a free function or a member function (template or not). Call the function in question TheFunc; it is called as
TheFunc(/*parameters*/);
or
ReturnType ret = TheFunc(/*parameters*/);
Of course I could wrap these function calls as follows :
double duration = 0.0 ;
std::clock_t start = std::clock();
TheFunc(/*parameters*/);
duration = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
or
double duration = 0.0 ;
std::clock_t start = std::clock();
ReturnType ret = TheFunc(/*parameters*/);
duration = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
but I would like to do something more elegant than this, namely (and from now on I will stick to the void return type) as follows :
Timer thetimer ;
double duration = 0.0;
thetimer(*TheFunc)(/*parameters*/, duration);
where Timer is some timing class that I would like to design and that would allow me to write the previous code, in such a way that after the execution of the last line of the previous code the double duration will contain the execution time of
TheFunc(/*parameters*/);
but I don't see how to do this, nor if the syntax/solution I aim for is optimal...
With a variadic template, you can do:
#include <ctime>
#include <utility>

// Note: std::clock measures processor time used by the program, not wall-clock time.
template <typename F, typename... Ts>
double Time_function(F&& f, Ts&&... args)
{
    std::clock_t start = std::clock();
    std::forward<F>(f)(std::forward<Ts>(args)...);
    return static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
}
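The question also asks about functions that return a value and about member functions. Here is a minimal sketch of one way to cover both, assuming C++14; the names timed_call, Accumulator and square are hypothetical and only serve the illustration. It swaps std::clock for std::chrono::steady_clock (so it measures wall-clock time rather than CPU time) and returns the result together with the elapsed seconds. Member functions are wrapped in a lambda.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: calls f(args...) and returns {result, elapsed seconds}.
// Only works for callables that return a non-void value.
template <typename F, typename... Ts>
auto timed_call(F&& f, Ts&&... args)
{
    const auto start = std::chrono::steady_clock::now();
    auto result = std::forward<F>(f)(std::forward<Ts>(args)...);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return std::make_pair(std::move(result), elapsed.count());
}

// Hypothetical type, used only to demonstrate timing a member function.
struct Accumulator {
    long sum_to(long n) const {
        long s = 0;
        for (long i = 1; i <= n; ++i) s += i;
        return s;
    }
};

long square(long x) { return x * x; }

inline void timed_call_demo()
{
    // Free function: r.first is the return value, r.second the duration.
    auto r = timed_call(square, 12L);
    assert(r.first == 144);

    // Member function: wrap the call in a lambda that captures the object.
    Accumulator acc;
    auto m = timed_call([&acc](long n) { return acc.sum_to(n); }, 100L);
    assert(m.first == 5050);
    assert(m.second >= 0.0);
}
```

With C++17 you could probably replace the direct call with std::invoke and pass &Accumulator::sum_to and the object directly, but the lambda version keeps the sketch within C++14.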
I really like boost::timer::auto_cpu_timer, and when I cannot use boost I simply hack my own:
#include <cmath>
#include <cstddef>
#include <string>
#include <chrono>
#include <iostream>
#include <utility>   // std::move

class AutoProfiler {
 public:
    // Records the start time when the object is constructed
    AutoProfiler(std::string name)
        : m_name(std::move(name)),
          m_beg(std::chrono::high_resolution_clock::now()) { }
    // Prints the elapsed time when the object goes out of scope
    ~AutoProfiler() {
        auto end = std::chrono::high_resolution_clock::now();
        auto dur = std::chrono::duration_cast<std::chrono::microseconds>(end - m_beg);
        std::cout << m_name << " : " << dur.count() << " musec\n";
    }
 private:
    std::string m_name;
    std::chrono::time_point<std::chrono::high_resolution_clock> m_beg;
};

void foo(std::size_t N) {
    long double x {1.234e5};
    for(std::size_t k = 0; k < N; k++) {
        x += std::sqrt(x);
    }
}

int main() {
    {
        AutoProfiler p("N = 10");
        foo(10);
    }
    {
        AutoProfiler p("N = 1,000,000");
        foo(1000000);
    }
}
This timer works thanks to RAII. When you build the object within a scope, you store the time point at that moment. When you leave the scope (that is, at the corresponding }), the timer first stores the current time point, then calculates the number of ticks (which you can convert to a human-readable duration), and finally prints it to the screen.
Of course, boost::timer::auto_cpu_timer is much more elaborate than my simple implementation, but I often find my implementation more than sufficient for my purposes.
Sample run on my computer:
$ g++ -o example example.cpp -std=c++14 -Wall -Wextra
$ ./example
N = 10 : 0 musec
N = 1,000,000 : 10103 musec
I really liked the implementation suggested by @Jarod42. I modified it a little bit to offer some flexibility on the desired "units" of the output.
It defaults to returning the number of elapsed microseconds (an integer of type std::chrono::microseconds::rep, which is a signed integer type), but you can request the output in any duration type of your choice.
I think it is a more flexible approach than the one I suggested earlier because now I can do other stuff like taking the measurements and storing them in a container (as I do in the example).
Thanks to @Jarod42 for the inspiration.
#include <cmath>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <numeric>   // std::accumulate
#include <vector>

template<typename Duration = std::chrono::microseconds,
         typename F,
         typename... Args>
typename Duration::rep profile(F&& fun, Args&&... args) {
    const auto beg = std::chrono::high_resolution_clock::now();
    std::forward<F>(fun)(std::forward<Args>(args)...);
    const auto end = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<Duration>(end - beg).count();
}

void foo(std::size_t N) {
    long double x {1.234e5};
    for(std::size_t k = 0; k < N; k++) {
        x += std::sqrt(x);
    }
}

int main() {
    std::size_t N { 1000000 };
    // profile in default mode (microseconds)
    std::cout << "foo(1E6) takes " << profile(foo, N) << " microseconds" << std::endl;
    // profile in custom mode (e.g., milliseconds)
    std::cout << "foo(1E6) takes " << profile<std::chrono::milliseconds>(foo, N) << " milliseconds" << std::endl;
    // To create an average of `M` runs we can create a vector to hold
    // `M` values of the type used by the duration representation, fill
    // them with the samples, and take the average
    std::size_t M {100};
    using rep = std::chrono::microseconds::rep;   // samples are in microseconds
    std::vector<rep> samples(M);
    for(auto & sample : samples) {
        sample = profile(foo, N);
    }
    // The initial value has type `rep` so the sum is not accumulated in an `int`
    auto avg = std::accumulate(samples.begin(), samples.end(), rep{0}) / static_cast<long double>(M);
    std::cout << "average of " << M << " runs: " << avg << " microseconds" << std::endl;
}
Output (compiled with g++ example.cpp -std=c++14 -Wall -Wextra -O3):
foo(1E6) takes 10073 microseconds
foo(1E6) takes 10 milliseconds
average of 100 runs: 10068.6 microseconds
You can do it the MATLAB way. It's very old-school, but simple is often good:
tic();
a = f(c);
toc(); //print to stdout, or
auto elapsed = toc(); //store in variable
tic() and toc() can operate on a global variable. If that's not sufficient, you can create local variables with some macro-magic:
tic(A);
a = f(c);
toc(A);
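The answer does not show how tic() and toc() would be implemented, so here is a minimal sketch of the global-variable variant, under my own assumptions: the tictoc namespace and the helper start_point() are hypothetical names, std::chrono::steady_clock is my choice of clock, and this toc() only returns the elapsed seconds (it does not print anything).

```cpp
#include <chrono>

namespace tictoc {

using clock_type = std::chrono::steady_clock;

// One shared start point for the whole program, so nested
// tic()/toc() pairs overwrite each other (as with a plain global).
inline clock_type::time_point& start_point()
{
    static clock_type::time_point t = clock_type::now();
    return t;
}

// Resets the shared start point to "now".
inline void tic() { start_point() = clock_type::now(); }

// Returns the seconds elapsed since the last tic().
inline double toc()
{
    const std::chrono::duration<double> elapsed =
        clock_type::now() - start_point();
    return elapsed.count();
}

} // namespace tictoc
```

Usage would then look like tictoc::tic(); a = f(c); double elapsed = tictoc::toc();. Since there is a single shared start point, this is not suitable for nested or multi-threaded measurements, which is where the local-variable (macro) variant comes in.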
I'm a fan of using RAII wrappers for this type of stuff.
The following example is a little verbose but it's more flexible in that it works with arbitrary scopes instead of being limited to a single function call:
#include <ctime>
#include <map>
#include <string>

class timing_context {
 public:
    std::map<std::string, double> timings;
};

class timer {
 public:
    timer(timing_context& ctx, std::string name)
        : ctx(ctx),
          name(name),
          start(std::clock()) {}
    // Stores the elapsed time (in seconds) under `name` on destruction
    ~timer() {
        ctx.timings[name] = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
    }
    timing_context& ctx;
    std::string name;
    std::clock_t start;
};

timing_context ctx;

int main() {
    timer total(ctx, "total");
    {
        timer t(ctx, "foo");
        // Do foo
    }
    {
        timer t(ctx, "bar");
        // Do bar
    }
    // Access ctx.timings ("total" is only written when `total` is destroyed at the end of main)
}
The downside is that you might end up with a lot of scopes that only serve to destroy the timing object.
This might or might not satisfy your requirements as your request was a little vague but it illustrates how using RAII semantics can make for some really nice reusable and clean code. It can probably be modified to look a lot better too!
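If the extra scopes bother you, one possible alternative is to pass the timed region as a lambda. This is only a sketch under my own assumptions: timed_section is a hypothetical helper, the timing_context here is a minimal stand-in for the class above, and I use std::chrono::steady_clock instead of std::clock. Unlike the RAII timer, nothing is recorded if the body throws.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Minimal stand-in for the timing_context above: seconds keyed by label.
struct timing_context {
    std::map<std::string, double> timings;
};

// Hypothetical helper: runs `body` once and stores its wall-clock
// duration (in seconds) under `name`. Not exception-safe: if `body`
// throws, no entry is written.
template <typename F>
void timed_section(timing_context& ctx, const std::string& name, F&& body)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(body)();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    ctx.timings[name] = elapsed.count();
}
```

A call would look like timed_section(ctx, "foo", [&] { /* Do foo */ });, after which ctx.timings["foo"] holds the duration.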