I have found the following parameters, which help restrict the number of work-items for a device based on the device memory:
I find the documentation for these parameters insufficient, so I am not able to use them properly. Can somebody please tell me what these parameters mean and how they are used? Is it necessary to check all of them?
PS: I have some brief understanding of some of the parameters but I am not sure whether my understanding is correct.
I read somewhere that, when the local work size is not specified, OpenCL creates 3 work-groups (of 217 work-items each) for a kernel with 651 work-items (651 is divisible by 3), while it creates 653 work-groups of 1 work-item each for a kernel with 653 work-items, since 653 is a prime number.
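One rule that would produce exactly those numbers is "pick the largest divisor of the global size that does not exceed the maximum work-group size". The sketch below is only an illustration of that rule. The helper name `pick_local_size` and the limit of 256 are assumptions, and a real OpenCL runtime is free to choose the local size differently when you pass NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest divisor of global_size that is
 * not larger than max_wg_size. This mimics one plausible heuristic; it is
 * not taken from any particular OpenCL implementation. */
static size_t pick_local_size(size_t global_size, size_t max_wg_size)
{
    assert(max_wg_size > 0);

    /* A work-group can never be larger than the global size itself. */
    size_t limit = global_size < max_wg_size ? global_size : max_wg_size;

    /* Walk downwards until a value divides the global size evenly. */
    for (size_t d = limit; d > 1; --d) {
        if (global_size % d == 0)
            return d;
    }
    return 1; /* prime (or otherwise awkward) sizes fall back to 1 */
}
```

With an assumed limit of 256, `pick_local_size(651, 256)` gives 217 (so 3 work-groups), and `pick_local_size(653, 256)` gives 1 (so 653 work-groups). That is why padding the global size to a friendlier number is usually worth it.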
Work-items: Each work-item in OpenCL is a thread in terms of its control flow and its memory model. The hardware may run multiple work-items on a single thread, and you can easily picture this by imagining four OpenCL work-items operating on the separate lanes of an SSE vector.
A kernel is essentially a function written in the OpenCL language that enables it to be compiled for execution on any device that supports OpenCL. The kernel is the only way the host can call a function that will run on a device. When the host invokes a kernel, many work items start running on the device.
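CL_DEVICE_GLOBAL_MEM_SIZE: Size of the device's global memory, in bytes.
CL_DEVICE_LOCAL_MEM_SIZE: Size of the local memory region, in bytes. Local memory is shared by the work-items of one work-group.
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: Maximum size, in bytes, of a constant buffer allocation.
CL_DEVICE_MAX_MEM_ALLOC_SIZE: Maximum size, in bytes, of a single memory object allocation.
CL_DEVICE_MAX_WORK_GROUP_SIZE: Maximum number of work-items in a work-group on this device.
CL_DEVICE_MAX_WORK_ITEM_SIZES: Maximum number of work-items in each dimension of a work-group.
CL_KERNEL_WORK_GROUP_SIZE: Maximum work-group size that can be used to run a particular kernel on a particular device. It is queried with clGetKernelWorkGroupInfo rather than clGetDeviceInfo.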
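To show how the three work-group limits combine, here is a minimal sketch of a check you could run before enqueueing a kernel. It assumes the values were already queried from CL_DEVICE_MAX_WORK_GROUP_SIZE, CL_DEVICE_MAX_WORK_ITEM_SIZES and CL_KERNEL_WORK_GROUP_SIZE and stored in a plain struct. The names `launch_limits` and `local_size_fits` are made up for this example and are not part of the OpenCL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for limits that are assumed to have been queried:
 *   max_work_group_size    <- CL_DEVICE_MAX_WORK_GROUP_SIZE
 *   max_work_item_sizes    <- CL_DEVICE_MAX_WORK_ITEM_SIZES (3 dims assumed)
 *   kernel_work_group_size <- CL_KERNEL_WORK_GROUP_SIZE */
typedef struct {
    size_t max_work_group_size;
    size_t max_work_item_sizes[3];
    size_t kernel_work_group_size;
} launch_limits;

/* Returns true if the proposed local work size respects every limit. */
static bool local_size_fits(const launch_limits *lim,
                            const size_t local[3], unsigned dims)
{
    assert(lim != NULL && dims >= 1 && dims <= 3);

    size_t total = 1;
    for (unsigned d = 0; d < dims; ++d) {
        /* Per-dimension limit. */
        if (local[d] == 0 || local[d] > lim->max_work_item_sizes[d])
            return false;
        total *= local[d];
    }

    /* The product of all dimensions must fit the device limit and the
     * (possibly smaller) per-kernel limit. */
    return total <= lim->max_work_group_size &&
           total <= lim->kernel_work_group_size;
}
```

For example, with a device limit of 1024 but a kernel limit of 256, a 16x16x1 work-group passes while 32x16x1 does not, even though 512 is below the device limit. The per-kernel value is normally the one that matters most.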
NOTE: All of these values are theoretical limits. If your kernel uses one resource more heavily than the others, e.g. local memory whose usage grows with the work-group size, you may not be able to reach the maximum number of work-items per work-group, because you may hit the local memory limit first.
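As a rough sketch of that note: if a kernel needs a fixed amount of local memory plus some bytes per work-item, the usable work-group size is the smaller of the work-group limit and what the local memory can hold. The function name and the split into "fixed" and "per work-item" bytes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: caps the work-group size by local memory usage.
 *   kernel_wg_limit      - e.g. the value of CL_KERNEL_WORK_GROUP_SIZE
 *   local_mem_size       - e.g. the value of CL_DEVICE_LOCAL_MEM_SIZE (bytes)
 *   fixed_local_bytes    - local memory the kernel uses regardless of size
 *   local_bytes_per_item - local memory needed per work-item (assumed linear) */
static size_t effective_max_work_group_size(size_t kernel_wg_limit,
                                            size_t local_mem_size,
                                            size_t fixed_local_bytes,
                                            size_t local_bytes_per_item)
{
    assert(fixed_local_bytes <= local_mem_size);

    /* No per-item usage: local memory does not restrict the group size. */
    if (local_bytes_per_item == 0)
        return kernel_wg_limit;

    size_t by_memory =
        (local_mem_size - fixed_local_bytes) / local_bytes_per_item;

    return by_memory < kernel_wg_limit ? by_memory : kernel_wg_limit;
}
```

For instance, with 32768 bytes of local memory and 64 bytes per work-item, the result is 512 even if the work-group limit is 1024. The local memory runs out first.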