
Optimal Local/Global worksizes in OpenCL

Tags:

opencl

I am wondering how to choose optimal local and global work sizes for different devices in OpenCL. Is there any universal rule for AMD, NVIDIA, and Intel GPUs? Should I analyze the physical build of the devices (number of multiprocessors, number of streaming processors per multiprocessor, etc.)?

Does it depend on the algorithm/implementation? I ask because I saw that some libraries (like ViennaCL) find the right values by simply testing many combinations of local/global work sizes and choosing the best one.

Krzysztof Bzowski asked Dec 12 '22 18:12



1 Answer

NVIDIA recommends that your local work-group size be a multiple of 32 (one warp, which is their smallest unit of scheduling: 32 threads/work-items are always scheduled together). AMD, on the other hand, recommends a multiple of 64 (one wavefront). I'm unsure about Intel, but you can find this type of information in their documentation.

So suppose you are doing some computation with 2300 work-items (the global size). 2300 is divisible by neither 64 nor 32. If you don't specify the local size, the OpenCL implementation chooses one for you, and it may well be a poor one. When the local size is not a multiple of the hardware's unit of execution, the last warp/wavefront of each work-group is only partly filled, so you get idle threads, which leads to bad device utilization. Thus, it can be beneficial to add some "dummy" threads so that the global size becomes a multiple of 32/64, and then use a local size of 32/64 (the global size has to be divisible by the local size). For 2300 you can add 4 dummy threads/work-items, because 2304 is divisible by both 32 and 64. In the actual kernel, you can write something like:

int globalID = get_global_id(0);
// Padding work-items (IDs at or past the real count) redo work-item 0's work.
if (globalID >= realNumberOfThreads)
    globalID = 0;

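This makes the four extra threads do the same work as thread 0. Here realNumberOfThreads is the real work-item count (2300 in this example), which you have to supply to the kernel yourself, for example as a kernel argument. (It is often faster to do some extra work than to have many idle threads.)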
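The padding arithmetic described above can be sketched on the host side roughly as follows. This is only a minimal sketch in C: the helper name round_up_global_size is hypothetical (it is not part of the OpenCL API), and it assumes you would pass the result as the global work size, together with your chosen local size, when enqueueing the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the OpenCL API): round the number of
 * real work-items up to the next multiple of the local work size. */
static size_t round_up_global_size(size_t real_items, size_t local_size)
{
    assert(local_size > 0);  /* a zero local size makes no sense here */

    size_t remainder = real_items % local_size;
    if (remainder == 0)
        return real_items;   /* already a multiple, no padding needed */

    /* Add just enough dummy work-items to fill the last work-group. */
    return real_items + (local_size - remainder);
}
```

With the numbers from the example, a local size of 32 and a local size of 64 both give a padded global size of 2304.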

Hope that answered your question. GL HF!

Erik Smistad answered Jun 24 '23 15:06
