
Measuring execution time of OpenCL kernels

I have the following loop that measures the time of my kernels:

double elapsed = 0;
cl_ulong time_start, time_end;
for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i)
{
    err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL, 0, NULL, &event); checkErr(err, "Kernel run");
    err = clWaitForEvents(1, &event); checkErr(err, "Kernel run wait for event");
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL); checkErr(err, "Kernel run get time start");
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL); checkErr(err, "Kernel run get time end");
    elapsed += (time_end - time_start);
}

Then I divide elapsed by NUMBER_OF_ITERATIONS to get the final estimate. However, I am afraid the execution time of individual kernels is too small and hence can introduce uncertainty into my measurement. How can I measure the time taken by all NUMBER_OF_ITERATIONS kernels combined?

Alternatively, can you suggest a profiling tool that could help with this? I do not need to access this data programmatically. I am using NVIDIA's OpenCL implementation.

user1096294 asked May 08 '14 19:05



1 Answer

Follow these steps to measure the execution time of an OpenCL kernel:

  1. Create a command queue. Profiling must be enabled when the queue is created:

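    // Named "queue" to match the calls in the later steps.
    // Note: clCreateCommandQueue is deprecated as of OpenCL 2.0 in favor of clCreateCommandQueueWithProperties.
    cl_command_queue queue;
    queue = clCreateCommandQueue(context, devices[deviceUsed], CL_QUEUE_PROFILING_ENABLE, &err);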
    
  2. Attach an event when launching the kernel:

    cl_event event;
    // work_dim is the number of dimensions; workgroupsize is passed here as the global work size.
    err = clEnqueueNDRangeKernel(queue, kernel, work_dim, NULL, workgroupsize, NULL, 0, NULL, &event);
    
  3. Wait for the kernel to finish

    clWaitForEvents(1, &event);
    
  4. Wait for all enqueued tasks to finish

    clFinish(queue);
    
  5. Get profiling data and calculate the kernel execution time (returned by the OpenCL API in nanoseconds)

    cl_ulong time_start;
    cl_ulong time_end;
    
    clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL);
    clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL);
    
    double nanoSeconds = time_end-time_start;
    printf("OpenCL execution time is: %0.3f milliseconds\n", nanoSeconds / 1000000.0);
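
The arithmetic in step 5 can also be applied to a whole batch of kernels, which is closer to what the question asks. Below is a minimal sketch, assuming the START and END timestamps of every event have already been read with clGetEventProfilingInfo and copied into plain arrays. The function names (elapsed_ms, sum_ms, span_ms) are hypothetical, and uint64_t stands in for cl_ulong so that the snippet compiles without the OpenCL headers. span_ms additionally assumes an in-order queue, so that the first event started first and the last event ended last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one start/end timestamp pair (nanoseconds) to milliseconds. */
double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    return (double)(end_ns - start_ns) / 1.0e6;
}

/* Sum of the individual kernel durations, as in the question's loop.
 * Accumulating in integer nanoseconds avoids rounding until the end. */
double sum_ms(const uint64_t *start_ns, const uint64_t *end_ns, size_t n)
{
    uint64_t total_ns = 0;
    for (size_t i = 0; i < n; ++i)
        total_ns += end_ns[i] - start_ns[i];
    return (double)total_ns / 1.0e6;
}

/* Time from the first kernel's start to the last kernel's end.
 * Unlike sum_ms, this includes any idle gaps between kernels.
 * Assumption: events are stored in the order they ran (in-order queue). */
double span_ms(const uint64_t *start_ns, const uint64_t *end_ns, size_t n)
{
    assert(n > 0);
    return elapsed_ms(start_ns[0], end_ns[n - 1]);
}
```

To feed these helpers, one option is to keep one cl_event per iteration, enqueue all kernels without waiting in between, call clFinish once, and only then query the timestamps. Dividing the result of span_ms by the number of iterations gives an average that includes launch gaps, while sum_ms divided by the number of iterations reproduces the estimate from the question.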
    
Dongwei Wang answered Oct 13 '22 01:10
