I am a new OpenCL programmer, and I am confused about how to set the workgroup size. Which is the correct way to set the workgroup size:
clEnqueueNDRangeKernel in host code, or
__attribute__((reqd_work_group_size(X, Y, Z))) in kernel code?

The reqd_work_group_size attribute ensures that the right workgroup size is passed in. Typically, the required amount of local memory is a function of the workgroup size, e.g. when working on a 16x16 tile of an image.
E.g. one can write:
__attribute__((reqd_work_group_size(16, 16, 1)))
kernel void foo(...) {
    local float tile[16][16]; // statically sized, so the compiler allocates the local memory
    ...
}
The compiler allocates the local memory and we needn't pass it in as an explicit argument. However, we need to ensure that the workgroup size matches that assumption. This attribute does exactly that.
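On the host side, the local size passed to clEnqueueNDRangeKernel still has to agree with the attribute; per the OpenCL specification, a mismatching local_work_size is expected to be rejected with CL_INVALID_WORK_GROUP_SIZE. The following is a minimal sketch of how the two size arrays could be prepared for the 16x16 kernel above. The names TILE_X, TILE_Y, round_up and make_ndrange are hypothetical helpers introduced only for illustration, and the padding assumes OpenCL 1.x rules, where each global size must be a multiple of the corresponding local size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror reqd_work_group_size(16, 16, 1) in the kernel source. */
#define TILE_X 16
#define TILE_Y 16

/* Hypothetical helper: round n up to the next multiple of local. */
static size_t round_up(size_t n, size_t local)
{
    assert(local > 0);
    return ((n + local - 1) / local) * local;
}

/* Hypothetical helper: fill the global and local size arrays for a
 * width x height image processed in TILE_X x TILE_Y tiles. */
static void make_ndrange(size_t width, size_t height,
                         size_t global[2], size_t local[2])
{
    local[0] = TILE_X;
    local[1] = TILE_Y;
    global[0] = round_up(width, TILE_X);   /* padded to whole tiles */
    global[1] = round_up(height, TILE_Y);
}

/* With an existing command queue and kernel object (both assumed here),
 * the arrays would then be passed along these lines:
 *
 *   clEnqueueNDRangeKernel(queue, kernel, 2, NULL,
 *                          global, local, 0, NULL, NULL);
 */
```

Because the global size is padded up to whole tiles, the kernel would normally need a bounds check so that work-items outside the real image do nothing.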