So accurate timing is important to me, and I was investigating the 3 types of clocks specified in C++11, namely system_clock, steady_clock, and high_resolution_clock.
My initial concern was testing if there is any difference in call overhead to the different types of clocks, and to check the resolution of each type of clock.
Here is my sample program:
#include <chrono>
#include <cstdio>
using namespace std;
using namespace std::chrono;
int main(int argc, char **argv)
{
    // number of clock::now() calls to time; can be overridden on the command line
    size_t N = 1e6;
    if(2 == argc) {
        sscanf(argv[1], "%zu", &N);
    }
    // select the clock under test at compile time with -Dhrc, -Dsc or -Dsys
#if defined(hrc)
    typedef high_resolution_clock clock;
 #warning "High resolution clock"
#elif defined(sc)
    typedef steady_clock clock;
 #warning "Steady clock"
#elif defined(sys)
    typedef system_clock clock;
 #warning "System clock"
#else
 #error "Define one of hrc, sc or sys to select a clock"
#endif
    // nominal tick period of the clock, in seconds
    const double resolution = double(clock::period::num) / double(clock::period::den);
    printf("clock::period: %lf us.\n", resolution*1e6);
    printf("clock::is_steady: %s\n", clock::is_steady ? "yes" : "no");
    printf("Calling clock::now() %zu times...\n", N);
    // first, warm up
    for(size_t i=0; i<100; ++i) {
        time_point<clock> t = clock::now();
    }
    // loop N times
    time_point<clock> start = clock::now();
    for(size_t i=0; i<N; ++i) {
        time_point<clock> t = clock::now();
    }
    time_point<clock> end = clock::now();
    // display duration
    duration<double> time_span = duration_cast<duration<double>>(end-start);
    const double sec = time_span.count();
    const double ns_it = sec*1e9/N;
    printf("That took %lf seconds. That's %lf ns/iteration.\n", sec, ns_it);
    return 0;
}
I compile it with
$ g++-4.7 -std=c++11 -Dhrc chrono.cpp -o hrc_chrono
chrono.cpp:15:2: warning: #warning "High resolution clock" [-Wcpp]
$ g++-4.7 -std=c++11 -Dsys chrono.cpp -o sys_chrono
chrono.cpp:15:2: warning: #warning "System clock" [-Wcpp]
$ g++-4.7 -std=c++11 -Dsc chrono.cpp -o sc_chrono
chrono.cpp:15:2: warning: #warning "Steady clock" [-Wcpp]
I compiled with G++ 4.7.2, and ran it on
The first surprising thing was that the 3 types of clock are apparently synonyms. They all have the same period (1 microsecond), and the time per call is practically the same. What's the point of specifying 3 types of clocks if they are all the same? Is this just because the G++ implementation of chrono isn't mature yet? Or maybe the 3.1.10 kernel only has one user-accessible clock?
The second surprise, and this is huge, is that steady_clock::is_steady == false. I'm fairly certain that by definition, that property should be true. What gives? How can I work around it (i.e., achieve a steady clock)?
If you can run the simple program on other platforms/compilers, I would be very interested to know the results. If anybody is wondering, it's about 25 ns/iteration on my Core i7, and 1000 ns/iteration on the Tegra 2.
steady_clock is supported in GCC 4.7 (as shown by the docs for the 4.7 release: http://gcc.gnu.org/onlinedocs/gcc-4.7.2/libstdc++/manual/manual/status.html#status.iso.2011), and steady_clock::is_steady is true, but only if you build GCC with --enable-libstdcxx-time=rt.
See https://stackoverflow.com/a/12961816/981959 for details of that configuration option.
For GCC 4.9 it will be enabled automatically if your OS and C library support POSIX monotonic clocks for clock_gettime (which is true for GNU/Linux with glibc 2.17 or later and for Solaris 10, IIRC).
Here are the results with GCC 4.8 configured with --enable-libstdcxx-time=rt on an AMD Phenom II X4 905e (2.5GHz, but I think it's throttled to 800MHz right now), running Linux 3.6.11, glibc 2.15:
$ ./hrc
clock::period: 0.001000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.069646 seconds. That's 69.645928 ns/iteration.
$ ./sys
clock::period: 0.001000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.062535 seconds. That's 62.534986 ns/iteration.
$ ./sc
clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.065684 seconds. That's 65.683730 ns/iteration.
And with GCC 4.7 without --enable-libstdcxx-time (so the same results for all three clock types) on an ARMv7 Exynos5 running Linux 3.4.0, glibc 2.16:
clock::period: 1.000000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 1.089904 seconds. That's 1089.904000 ns/iteration.
"If you can run the simple program on other platforms/compilers, I would be very interested to know the results."
Mac OS X 10.8, clang++ / libc++, -O3, 2.8 GHz Core i5:
High resolution clock
clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.021833 seconds. That's 21.832827 ns/iteration.
System clock
clock::period: 1.000000 us.
clock::is_steady: no
Calling clock::now() 1000000 times...
That took 0.041930 seconds. That's 41.930000 ns/iteration.
Steady clock
clock::period: 0.001000 us.
clock::is_steady: yes
Calling clock::now() 1000000 times...
That took 0.021478 seconds. That's 21.477953 ns/iteration.
steady_clock and system_clock are required to be distinct types. steady_clock::is_steady is required to be true. high_resolution_clock may be a distinct type or an alias of steady_clock or system_clock. system_clock::rep must be a signed type.