What is the difference between using a CPU timer and the CUDA timer event to measure the time taken for the execution of some CUDA code?
Which of these should a CUDA programmer use?
And why?
What I know:

A CPU timer would involve calling cudaThreadSynchronize (now deprecated in favor of cudaDeviceSynchronize) before any time is noted.
For noting the time, one of these could be used:
clock()
QueryPerformanceCounter (on Windows)
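A minimal host-timer sketch, assuming a placeholder kernel myKernel and a placeholder launch configuration, and using std::chrono in place of clock()/QueryPerformanceCounter for portability:

    #include <chrono>
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void myKernel(float *data) { /* placeholder kernel */ }

    void timeWithHostTimer(float *d_data)
    {
        cudaDeviceSynchronize();                 // drain any GPU work already queued
        auto t0 = std::chrono::high_resolution_clock::now();

        myKernel<<<256, 256>>>(d_data);          // the launch itself is asynchronous
        cudaDeviceSynchronize();                 // wait for the kernel to finish

        auto t1 = std::chrono::high_resolution_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        printf("host timer: %f ms\n", ms);
    }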
A CUDA timer event would involve recording events before and after the timed work using cudaEventRecord. At a later time, the elapsed time would be obtained by calling cudaEventSynchronize on the stop event, followed by cudaEventElapsedTime.
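A minimal sketch of the event approach, again with a placeholder kernel and launch configuration:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void myKernel(float *data) { /* placeholder kernel */ }

    void timeWithEvents(float *d_data)
    {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start, 0);               // record in the default stream
        myKernel<<<256, 256>>>(d_data);
        cudaEventRecord(stop, 0);

        cudaEventSynchronize(stop);              // block until the stop event has completed
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
        printf("event timer: %f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
    }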
The answer to the first part of the question is that cudaEvents timers are based on high-resolution counters on board the GPU, and they have lower latency and better resolution than a host timer because they come "off the metal". You should expect sub-microsecond resolution from the cudaEvents timers, and you should prefer them for timing GPU operations for precisely that reason. The per-stream nature of cudaEvents can also be useful for instrumenting asynchronous operations like simultaneous kernel execution and overlapped copy and kernel execution. That sort of time measurement is just about impossible with host timers.
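As an illustration of the per-stream point, here is a rough sketch of timing an asynchronous copy and a kernel that run in separate streams; the stream names, buffers, kernel, and launch configuration below are all placeholders:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void myKernel(float *data) { /* placeholder kernel */ }

    void timeOverlappedWork(float *h_src, float *d_dst, float *d_other, size_t bytes)
    {
        // h_src should be pinned (cudaMallocHost) or the copy will not overlap.
        cudaStream_t copyStream, execStream;
        cudaStreamCreate(&copyStream);
        cudaStreamCreate(&execStream);

        cudaEvent_t copyStart, copyStop, kernStart, kernStop;
        cudaEventCreate(&copyStart);  cudaEventCreate(&copyStop);
        cudaEventCreate(&kernStart);  cudaEventCreate(&kernStop);

        // Time the async copy in its own stream.
        cudaEventRecord(copyStart, copyStream);
        cudaMemcpyAsync(d_dst, h_src, bytes, cudaMemcpyHostToDevice, copyStream);
        cudaEventRecord(copyStop, copyStream);

        // Time the kernel in a second stream; it works on an unrelated buffer,
        // so the copy and the kernel are free to overlap.
        cudaEventRecord(kernStart, execStream);
        myKernel<<<256, 256, 0, execStream>>>(d_other);
        cudaEventRecord(kernStop, execStream);

        cudaEventSynchronize(copyStop);
        cudaEventSynchronize(kernStop);

        float copyMs = 0.0f, kernMs = 0.0f;
        cudaEventElapsedTime(&copyMs, copyStart, copyStop);
        cudaEventElapsedTime(&kernMs, kernStart, kernStop);
        printf("copy: %f ms, kernel: %f ms\n", copyMs, kernMs);

        cudaEventDestroy(copyStart);  cudaEventDestroy(copyStop);
        cudaEventDestroy(kernStart);  cudaEventDestroy(kernStop);
        cudaStreamDestroy(copyStream);
        cudaStreamDestroy(execStream);
    }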
EDIT: I won't answer the last paragraph of the question, because you deleted it.