
OpenCL - How do I query for a device's SIMD width?

Tags: gpgpu, gpu, opencl

In CUDA, there is a concept of a warp, which is defined as the maximum number of threads that can execute the same instruction simultaneously within a single processing element. For NVIDIA, this warp size is 32 for all of their cards currently on the market.

In ATI cards, there is a similar concept, but the terminology in this context is wavefront. After some hunting around, I found out that the ATI card I have has a wavefront size of 64.

My question is, what can I do to query for this SIMD width at runtime for OpenCL?

Jonathan DeCarlo asked Aug 17 '11 13:08



3 Answers

I found the answer I was looking for. It turns out that you don't query the device for this information, you query the kernel object (in OpenCL). My source is:

http://www.hpc.lsu.edu/training/tutorials/sc10/tutorials/SC10Tutorials/docs/M13/M13.pdf

(Page 108)

which says:

The most efficient work group sizes are likely to be multiples of the native hardware execution width

  • wavefront size in AMD speak/warp size in Nvidia speak
  • Query device for CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

So, in short, the answer appears to be to call the clGetKernelWorkGroupInfo() function with a param_name of CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. See this link for more information on this function:

http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetKernelWorkGroupInfo.html
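
Once you have that value, a typical use is to round a requested work-group size up to a multiple of it, following the slide's advice above. Here is a minimal sketch in plain C. The helper name round_up_to_multiple is hypothetical, and the OpenCL call appears only in a comment (with assumed variable names kernel and device) so that the snippet compiles without an OpenCL SDK.

```c
#include <assert.h>
#include <stddef.h>

/*
 * In real host code, `multiple` would come from the kernel query, e.g.:
 *
 *   size_t multiple = 0;
 *   clGetKernelWorkGroupInfo(kernel, device,
 *       CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
 *       sizeof(multiple), &multiple, NULL);
 *
 * `kernel` and `device` are assumed to be a valid cl_kernel and
 * cl_device_id that were created elsewhere.
 */

/* Round `requested` up to the nearest multiple of `multiple`. */
size_t round_up_to_multiple(size_t requested, size_t multiple)
{
    assert(multiple > 0); /* a zero multiple would make the result meaningless */
    size_t remainder = requested % multiple;
    return remainder == 0 ? requested : requested + (multiple - remainder);
}
```

For example, with a reported multiple of 64, a request for 1000 work-items would be padded to 1024. The kernel then needs a bounds check so the extra work-items do nothing.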

Jonathan DeCarlo answered Sep 21 '22 04:09



On AMD, you can query CL_DEVICE_WAVEFRONT_WIDTH_AMD with clGetDeviceInfo. That's different from CL_DEVICE_SIMD_WIDTH_AMD, which returns the number of threads the hardware executes in each clock cycle. The latter may be smaller than the wavefront size, in which case it takes multiple clock cycles to execute one instruction for all the threads in a wavefront.
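
To make the relationship between the two values concrete, here is a rough sketch in plain C of the arithmetic. The function name cycles_per_wavefront_instruction is hypothetical, and the model is deliberately simplified: it only divides one width by the other and ignores latency, scheduling, and everything else the hardware does.

```c
#include <assert.h>

/*
 * Simplified model: number of clock cycles needed to push one instruction
 * through every thread of a wavefront, given how many threads execute per
 * cycle. Both arguments stand in for values that would be read with
 * clGetDeviceInfo (CL_DEVICE_WAVEFRONT_WIDTH_AMD and CL_DEVICE_SIMD_WIDTH_AMD).
 */
unsigned cycles_per_wavefront_instruction(unsigned wavefront_width,
                                          unsigned simd_width)
{
    assert(simd_width > 0); /* avoid dividing by zero */
    /* Ceiling division, in case the widths do not divide evenly. */
    return (wavefront_width + simd_width - 1) / simd_width;
}
```

With a wavefront width of 64 and an assumed SIMD width of 16 (an illustrative figure, not something reported in this thread), the function returns 4, i.e. four cycles per instruction for the full wavefront.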

peastman answered Sep 19 '22 04:09



On NVIDIA, you can query the warp size using clGetDeviceInfo with CL_DEVICE_WARP_SIZE_NV (although this is always 32 for current GPUs). However, this is an extension, as OpenCL itself defines nothing like warps or wavefronts. I don't know of any AMD extension that would allow querying the wavefront size.

Radim Vansa answered Sep 21 '22 04:09
