I'm trying to compare GPU to CPU performance. For the NVIDIA GPU I've been using cudaEvent_t events to get very precise timings.
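For reference, the GPU side looks roughly like this (a sketch only: the actual kernel launch, error checking, and the cuda_runtime.h include are omitted):

    cudaEvent_t start, stop;
    float gpuTimeMs = 0.0f;

    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    // ... launch the kernel being timed here ...
    cudaEventRecord(stop, 0);

    cudaEventSynchronize(stop);                      // wait until the stop event has completed
    cudaEventElapsedTime(&gpuTimeMs, start, stop);   // elapsed time in milliseconds

    cudaEventDestroy(start);
    cudaEventDestroy(stop);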
For the CPU I've been using the following code:
    // Timers
    clock_t start, stop;
    float elapsedTime = 0;

    // Capture the start time
    start = clock();

    // Do something here
    .......

    // Capture the stop time
    stop = clock();

    // Retrieve time elapsed in milliseconds
    elapsedTime = (float)(stop - start) / (float)CLOCKS_PER_SEC * 1000.0f;
Apparently, that piece of code is only reliable if you're measuring whole seconds, and the results sometimes come out quite strange.
Does anyone know of a way to create a high-resolution timer in Linux?
Check out clock_gettime, which is a POSIX interface to high-resolution timers. If, having read the manpage, you're left wondering about the difference between CLOCK_REALTIME and CLOCK_MONOTONIC, see Difference between CLOCK_REALTIME and CLOCK_MONOTONIC?
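For timing a wall-clock interval (which is what you want when comparing against the GPU numbers), CLOCK_MONOTONIC is usually the right choice, since it is not affected by system clock adjustments. A minimal sketch, with the work to be timed standing in as a comment:

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec start, stop;

        clock_gettime(CLOCK_MONOTONIC, &start);
        // ... do the work you want to time here ...
        clock_gettime(CLOCK_MONOTONIC, &stop);

        // convert the two timespecs to elapsed milliseconds
        double elapsed_ms = (stop.tv_sec - start.tv_sec) * 1000.0
                          + (stop.tv_nsec - start.tv_nsec) / 1e6;
        printf("Elapsed: %.3f ms\n", elapsed_ms);
        return 0;
    }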
See the following page for a complete example: http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
    #include <iostream>
    #include <time.h>
    using namespace std;

    timespec diff(timespec start, timespec end);

    int main()
    {
        timespec time1, time2;
        int temp = 0;   // initialise, otherwise the loop operates on an indeterminate value
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time1);
        for (int i = 0; i < 242000000; i++)
            temp += temp;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time2);
        cout << diff(time1, time2).tv_sec << ":" << diff(time1, time2).tv_nsec << endl;
        return 0;
    }

    timespec diff(timespec start, timespec end)
    {
        timespec temp;
        if ((end.tv_nsec - start.tv_nsec) < 0) {
            temp.tv_sec = end.tv_sec - start.tv_sec - 1;
            temp.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec;
        } else {
            temp.tv_sec = end.tv_sec - start.tv_sec;
            temp.tv_nsec = end.tv_nsec - start.tv_nsec;
        }
        return temp;
    }
To summarise the information presented so far, these are the two functions required for typical applications.
    #include <time.h>

    // call this function to start a nanosecond-resolution timer
    struct timespec timer_start(){
        struct timespec start_time;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start_time);
        return start_time;
    }

    // call this function to end a timer, returning nanoseconds elapsed as a long
    long timer_end(struct timespec start_time){
        struct timespec end_time;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end_time);
        long diffInNanos = (end_time.tv_sec - start_time.tv_sec) * (long)1e9
                         + (end_time.tv_nsec - start_time.tv_nsec);
        return diffInNanos;
    }
Here is an example of how to use them to time how long it takes to calculate the variance of a list of inputs.
    struct timespec vartime = timer_start();        // begin a timer called 'vartime'
    double variance = var(input, MAXLEN);           // perform the task we want to time
    long time_elapsed_nanos = timer_end(vartime);
    printf("Variance = %f, Time taken (nanoseconds): %ld\n", variance, time_elapsed_nanos);
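One practical note: on older glibc (before 2.17) clock_gettime lives in librt, so if linking fails with an undefined reference you may need to add -lrt; the source file name here is just a placeholder:

    gcc -O2 timing.c -o timing -lrt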