 

CUDA: Difference between CPU timer and CUDA timer event?

Tags:

cuda

timer

What is the difference between using a CPU timer and the CUDA timer event to measure the time taken for the execution of some CUDA code?
Which of these should a CUDA programmer use?
And why?


What I know:

Using a CPU timer would involve calling cudaDeviceSynchronize (formerly cudaThreadSynchronize, now deprecated) before any time is noted, because kernel launches and many other CUDA calls return asynchronously. For noting the time, one of these could be used (see the sketch after the list):

  1. clock()
  2. high-resolution performance counter like QueryPerformanceCounter (on Windows)
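A minimal sketch of the CPU-timer approach, using std::chrono::steady_clock as a portable stand-in for the platform-specific counters listed above. The kernel myKernel, its launch configuration, and the buffer sizes are hypothetical, not from the question:

    #include <chrono>
    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical kernel, just to have something to time.
    __global__ void myKernel(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;
    }

    int main()
    {
        const int n = 1 << 20;
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));

        cudaDeviceSynchronize();                  // drain any pending GPU work first
        auto start = std::chrono::steady_clock::now();

        myKernel<<<(n + 255) / 256, 256>>>(d_data, n);

        cudaDeviceSynchronize();                  // the launch is async: wait for the kernel
        auto stop = std::chrono::steady_clock::now();

        std::chrono::duration<double, std::milli> ms = stop - start;
        printf("host-timed: %.3f ms\n", ms.count());

        cudaFree(d_data);
        return 0;
    }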

A CUDA timer event would involve recording a start event and a stop event with cudaEventRecord. The elapsed time would later be obtained by calling cudaEventSynchronize on the stop event, followed by cudaEventElapsedTime.
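The event-based version of the same measurement might look like this sketch, reusing the hypothetical myKernel, n, and d_data from the snippet above:

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);                // record into the default stream
    myKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaEventRecord(stop, 0);

    cudaEventSynchronize(stop);               // block until 'stop' has actually happened
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // elapsed time in milliseconds
    printf("event-timed: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);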

Ashwin Nanjappa asked Apr 29 '11

1 Answer

The answer to the first part of the question is that cudaEvent timers are based on high-resolution counters on board the GPU, and they have lower latency and better resolution than a host timer because they come "off the metal". You should expect sub-microsecond resolution from the cudaEvent timers, and you should prefer them for timing GPU operations for precisely that reason. Because events are recorded into streams, they are also useful for instrumenting asynchronous operations such as concurrent kernel execution and overlapped copy and kernel execution (see the sketch below). That sort of time measurement is just about impossible using host timers.
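A rough sketch of that per-stream instrumentation: events recorded into separate streams time an asynchronous copy and a concurrent kernel independently. The buffer names and myKernel are hypothetical, and pinned host memory (cudaMallocHost) is assumed, since the copy cannot actually overlap the kernel otherwise:

    // Hypothetical buffers; pinned host memory is needed for overlap.
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h_buf, *d_buf, *d_data;
    cudaMallocHost(&h_buf, bytes);
    cudaMalloc(&d_buf, bytes);
    cudaMalloc(&d_data, bytes);

    cudaStream_t copyStream, execStream;
    cudaStreamCreate(&copyStream);
    cudaStreamCreate(&execStream);

    cudaEvent_t copyStart, copyStop, kernStart, kernStop;
    cudaEventCreate(&copyStart); cudaEventCreate(&copyStop);
    cudaEventCreate(&kernStart); cudaEventCreate(&kernStop);

    // Time an async host-to-device copy in one stream...
    cudaEventRecord(copyStart, copyStream);
    cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, copyStream);
    cudaEventRecord(copyStop, copyStream);

    // ...while a kernel runs concurrently in another stream.
    cudaEventRecord(kernStart, execStream);
    myKernel<<<(n + 255) / 256, 256, 0, execStream>>>(d_data, n);
    cudaEventRecord(kernStop, execStream);

    cudaEventSynchronize(copyStop);
    cudaEventSynchronize(kernStop);

    float copyMs = 0.0f, kernMs = 0.0f;
    cudaEventElapsedTime(&copyMs, copyStart, copyStop);  // copy time alone
    cudaEventElapsedTime(&kernMs, kernStart, kernStop);  // kernel time alone
    printf("copy: %.3f ms, kernel: %.3f ms\n", copyMs, kernMs);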


talonmies answered Oct 15 '22