I have this code, for which I already posted something some time ago.
Today I got my kernel running with a typedef struct in a little test program, but clEnqueueNDRangeKernel gives an invalid work group size error. According to the Khronos website, this can have 3 causes. My local work size isn't NULL, it's 128. I've searched the internet for quite a few hours, and most solutions I found involve querying clGetKernelWorkGroupInfo for the maximum local work size. When I do that, it also reports 1024. I'm really out of options now, can somebody help? :)
main: http://pastebin.com/S6R6t3iF kernel: http://pastebin.com/Mrhr8B4v
From your pastebin link, I see:
#define MAX_OP_X 4
#define MAX_OP_Y 4
#define MAX_OP MAX_OP_X * MAX_OP_Y //aantal observer points
#define MAX_SEGMENTEN 128 //aantal segmenten
...
size_t globalSize = MAX_OP;
size_t localSize = MAX_SEGMENTEN;
...
errMsg = clEnqueueNDRangeKernel (commandQueue, kernel, 1, NULL, &globalSize, &localSize, 0, NULL, NULL);
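This means you are trying to enqueue your kernel with a global size of 16 and a local size of 128. That's almost certainly not what you want, and it is what triggers the error: the global size has to be evenly divisible by the local size (at least in OpenCL 1.x), and 16 is not a multiple of 128. Remember, the global size is the total number of work items you want to run, and the local size is the size of each workgroup. For example, if you have a global size of 1024x1024 and a local size of 16x16, you would have 4096 workgroups of 256 work items each. Whether that is valid depends on your compute device, since 256 work items per workgroup must not exceed the device's maximum workgroup size.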
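In OpenCL 1.x the global size generally has to be a whole multiple of the local size, so one common workaround is to round the global size up before enqueueing. The following is only a minimal sketch of that arithmetic in plain C; the helper names round_up_global_size and workgroup_count are hypothetical and do not come from your pastebin or from the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of local_size that is
 * greater than or equal to work_items. */
static size_t round_up_global_size(size_t work_items, size_t local_size)
{
    assert(local_size > 0);
    size_t remainder = work_items % local_size;
    return remainder == 0 ? work_items
                          : work_items + (local_size - remainder);
}

/* Hypothetical helper: number of workgroups for a global size that is
 * already a multiple of the local size. */
static size_t workgroup_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}
```

With your values, round_up_global_size(16, 128) gives 128, i.e. a single workgroup. If you pad the global size like this, the kernel needs a guard such as if (get_global_id(0) >= n) return; so the extra work items do nothing. If you actually meant to run one work item per observer point and segment, the global size would instead be MAX_OP * MAX_SEGMENTEN, but that is a guess about your intent.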
With regard to passing a NULL local size: the CL spec says that if you do that, the CL implementation can choose whatever it wants as the local workgroup size. Ideally, it will try to do something clever on your behalf, but you have no guarantees.