I'm writing a kernel module for a special camera, using V4L2 to transfer frames to userspace code. The application then does a lot of further processing in userspace.
Timing is critical here, so I've been doing a lot of performance profiling, using plain old std::chrono::steady_clock to track timing. I've now reached the point where I also need to collect timing data on the kernel side, so that I can analyze the entire path from the hardware interrupt through V4L2 DQBUF to userspace.
Can anyone recommend a good way to get high-resolution timing data in the kernel that is consistent with the timestamps taken in the userspace application, so that the two can be compared? Right now I'm measuring activity in microseconds.
Ubuntu 12.04 LTS
At the lowest level, there are the rdtsc and rdtscp instructions if you're on an x86/x86-64 processor. They should provide the lowest-overhead, highest-resolution timestamps available on both sides of the kernel/userspace boundary.
However, there are things you need to worry about. You need to make sure that both readings are taken on the same core/CPU, that the process isn't context switched in between, and that the clock frequency doesn't change between invocations. If the CPU supports an invariant TSC (constant_tsc in /proc/cpuinfo), readings are a little more reliable across CPUs/cores and frequency changes.
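As a rough illustration of the /proc/cpuinfo check mentioned above, here is a minimal sketch in C++. The helper names has_cpu_flag and cpu_reports_constant_tsc are made up for this example, and it only inspects the first "flags" line, on the assumption that all cores report the same flags.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `flag` appears as a whole token on the
// first "flags" line of a /proc/cpuinfo-style stream.
bool has_cpu_flag(std::istream& cpuinfo, const std::string& flag) {
    std::string line;
    while (std::getline(cpuinfo, line)) {
        if (line.compare(0, 5, "flags") != 0) continue;   // not a flags line
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;
        std::istringstream tokens(line.substr(colon + 1));
        std::string tok;
        while (tokens >> tok) {
            if (tok == flag) return true;                  // exact token match
        }
        return false;  // assumption: every core lists the same flags
    }
    return false;
}

// Hypothetical convenience wrapper around the real /proc/cpuinfo file.
bool cpu_reports_constant_tsc() {
    std::ifstream f("/proc/cpuinfo");
    return has_cpu_flag(f, "constant_tsc");
}
```

Calling cpu_reports_constant_tsc() once at startup and logging the result is probably enough; if it returns false, TSC deltas taken across frequency changes should be treated with suspicion.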
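This should provide roughly nanosecond resolution. Note that the raw values are in TSC ticks, so they have to be scaled by the TSC frequency before they can be compared with times in microseconds.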
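On the userspace side, reading the counter could look something like the sketch below. It assumes GCC or Clang, which provide the __rdtscp intrinsic in <x86intrin.h>; the TscSample struct and the read_tsc and elapsed_ticks helpers are names invented for this example. The auxiliary value returned by rdtscp is set by the OS (on Linux it is expected to identify the CPU), so comparing it between two samples is a cheap way to detect that the thread migrated. The steady_clock fallback is only there so the sketch still compiles on non-x86 targets.

```cpp
#include <chrono>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   // __rdtscp (GCC/Clang intrinsic)
#endif

// One counter reading plus the auxiliary value reported alongside it.
struct TscSample {
    std::uint64_t ticks;  // raw TSC ticks (nanoseconds in the fallback path)
    std::uint32_t aux;    // IA32_TSC_AUX; assumed to identify the CPU on Linux
};

inline TscSample read_tsc() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int aux = 0;
    const std::uint64_t t = __rdtscp(&aux);
    return {t, aux};
#else
    // Fallback for non-x86 builds: not a TSC at all, just steady_clock.
    const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now().time_since_epoch()).count();
    return {static_cast<std::uint64_t>(ns), 0};
#endif
}

// Writes end - start to `out` and returns true, but only if both samples
// carry the same aux value, i.e. they appear to come from the same CPU.
inline bool elapsed_ticks(const TscSample& start, const TscSample& end,
                          std::uint64_t& out) {
    if (start.aux != end.aux) return false;  // likely migrated between cores
    out = end.ticks - start.ticks;
    return true;
}
```

In the kernel module, the matching read would be something along the lines of get_cycles(), logged next to each event of interest; treat that pairing as a suggestion to verify on your kernel version rather than a tested recipe.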