
Determine max global work group size based on device memory in OpenCL?

Tags:

opencl

I have found the following parameters, which help restrict the number of work items for a device based on the device memory:

  • CL_DEVICE_GLOBAL_MEM_SIZE
  • CL_DEVICE_LOCAL_MEM_SIZE
  • CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
  • CL_DEVICE_MAX_MEM_ALLOC_SIZE
  • CL_DEVICE_MAX_WORK_GROUP_SIZE
  • CL_DEVICE_MAX_WORK_ITEM_SIZES
  • CL_KERNEL_WORK_GROUP_SIZE

I find the documentation for these parameters insufficient, so I am not able to use them properly. Can somebody please tell me what these parameters mean and how they are used? Is it necessary to check all of them?

PS: I have some brief understanding of some of the parameters but I am not sure whether my understanding is correct.

Cool_Coder asked Apr 11 '14 15:04



1 Answer

CL_DEVICE_GLOBAL_MEM_SIZE:

  • Total global memory of the device, in bytes. You typically don't need to care about it unless you work with large amounts of data. If you use more than is available, the OpenCL runtime reports an error such as CL_OUT_OF_RESOURCES or CL_MEM_OBJECT_ALLOCATION_FAILURE.

CL_DEVICE_LOCAL_MEM_SIZE:

  • Amount of local memory available to each work group (WG), in bytes. This limit holds only under ideal conditions: if your kernel uses a large number of work items (WI) per WG, some of the private WI data may be spilled to local memory. Treat it as the maximum amount available per WG.

CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:

  • The maximum amount of constant memory, in bytes, that can be used by a single kernel. If your constant buffers together exceed this amount, the kernel will either fail or fall back to normal global memory, which may be slower.

CL_DEVICE_MAX_MEM_ALLOC_SIZE:

  • The maximum size, in bytes, of a single memory object you can allocate on the device.
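
A rough sketch of how the two allocation limits above can be checked together before creating buffers. The helper name buffers_fit is made up for illustration and is not part of OpenCL; it assumes you have already read CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE with clGetDeviceInfo and pass the values in as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an OpenCL API): returns true if every buffer
 * fits in a single allocation and all buffers together fit in global memory.
 * max_alloc  : value reported for CL_DEVICE_MAX_MEM_ALLOC_SIZE (bytes)
 * global_mem : value reported for CL_DEVICE_GLOBAL_MEM_SIZE (bytes) */
static bool buffers_fit(const uint64_t *sizes, size_t count,
                        uint64_t max_alloc, uint64_t global_mem)
{
    assert(sizes != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] > max_alloc)
            return false;   /* a single piece is too large */
        if (sizes[i] > global_mem - total)
            return false;   /* running total would exceed global memory */
        total += sizes[i];
    }
    return true;
}
```

Passing this check does not guarantee that the allocations succeed, since the device memory may be shared with other contexts or applications; it only rules out requests that exceed the reported limits.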

CL_DEVICE_MAX_WORK_GROUP_SIZE:

  • Maximum work group size the device supports. This is the ideal maximum; depending on the kernel code, the actual limit may be lower.

CL_DEVICE_MAX_WORK_ITEM_SIZES:

  • The maximum number of work items per dimension of a work group. For example, the device may allow at most 1024 WI per work group and 3 dimensions, but you may still not be able to use (1024,1,1), because the per-dimension limits may be (64,64,64). In that case you could use (64,2,8), for example.
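
The (64,64,64) example can be written as a small validity check. This is only a sketch: local_size_ok is a made-up name, and it assumes the per-dimension limits and the total limit were obtained beforehand from CL_DEVICE_MAX_WORK_ITEM_SIZES and CL_DEVICE_MAX_WORK_GROUP_SIZE (or CL_KERNEL_WORK_GROUP_SIZE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): checks a local work size against
 * the per-dimension limits and against the total work group size limit. */
static bool local_size_ok(const size_t *local, size_t dims,
                          const size_t *max_item_sizes, size_t max_wg_size)
{
    assert(dims >= 1);
    size_t total = 1;   /* product of the dimensions accepted so far */
    for (size_t d = 0; d < dims; ++d) {
        if (local[d] == 0 || local[d] > max_item_sizes[d])
            return false;   /* violates the per-dimension limit */
        if (local[d] > max_wg_size / total)
            return false;   /* product would exceed the total limit */
        total *= local[d];
    }
    return true;
}
```

With limits (64,64,64) and a total of 1024, this rejects (1024,1,1) and (64,64,1) but accepts (64,2,8).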

CL_KERNEL_WORK_GROUP_SIZE:

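  • The maximum work group size that can be used for a specific kernel on a specific device, queried with clGetKernelWorkGroupInfo. The implementation derives it from the resources the kernel needs (registers, memory spilling, etc.), so it can be lower than CL_DEVICE_MAX_WORK_GROUP_SIZE. You can choose a smaller size, but not a larger one.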

NOTE: All of these values are theoretical limits. If your kernel uses one resource more heavily than the others, e.g. local memory that grows with the work group size, you may not be able to reach the maximum number of work items per work group, because you may hit the local memory limit first.
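
To make the note concrete, here is a minimal sketch of the "whichever limit is hit first" arithmetic. The function name and the idea of a fixed number of local-memory bytes per work item are assumptions for illustration; the inputs would come from CL_KERNEL_WORK_GROUP_SIZE and CL_DEVICE_LOCAL_MEM_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): largest work group size allowed
 * when each work item needs bytes_per_item bytes of local memory.
 * kernel_wg_size : value reported for CL_KERNEL_WORK_GROUP_SIZE
 * local_mem_size : value reported for CL_DEVICE_LOCAL_MEM_SIZE (bytes) */
static size_t usable_wg_size(size_t kernel_wg_size, size_t local_mem_size,
                             size_t bytes_per_item)
{
    assert(kernel_wg_size > 0);
    if (bytes_per_item == 0)
        return kernel_wg_size;   /* local memory is not a constraint */
    size_t by_memory = local_mem_size / bytes_per_item;
    return by_memory < kernel_wg_size ? by_memory : kernel_wg_size;
}
```

For instance, with an illustrative 32768 bytes of local memory and a kernel limit of 256, needing 256 bytes per work item caps the group at 128 work items, while needing only 16 bytes per work item leaves the kernel limit of 256 as the binding one.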

DarkZeros answered Sep 28 '22 18:09