I want to create a local array inside my OpenCL kernel, whose size depends on a parameter of the kernel. It seems that's not allowed - at least with AMD APP.
Is your experience different? Perhaps it's just the APP? Or is there some rationale here?
Edit: I would now suggest variable length arrays should be allowed in CPU-side code too, and it was an unfortunate call by the C standard committee; but the question stands.
You can set the size of a local block dynamically. Declare it as a __local pointer parameter of your kernel, and specify its size in bytes when you call clSetKernelArg.
definition example:
__kernel void kernelName(__local float* myLocalFloats, ...)
host code:
clSetKernelArg(kernel, 0, myLocalFloatCount * sizeof(float), NULL); // <-- set the size to the correct number of bytes to allocate, but use NULL for the data.
Make sure you know what the limit for local memory is on your device before you do this. Call clGetDeviceInfo and query the CL_DEVICE_LOCAL_MEM_SIZE value, which is reported in bytes.
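As a rough sketch of that size check, the helper below computes the byte count to hand to clSetKernelArg and rejects requests that exceed the device limit. The function name local_float_bytes and its parameters are made up for illustration; the limit is assumed to be the value you already read with clGetDeviceInfo.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of OpenCL.
 * Returns the number of bytes needed for `count` floats of __local memory,
 * or 0 if count is zero, the multiplication would overflow, or the request
 * exceeds `local_mem_limit` (the CL_DEVICE_LOCAL_MEM_SIZE value, in bytes). */
size_t local_float_bytes(size_t count, unsigned long long local_mem_limit)
{
    if (count == 0 || count > SIZE_MAX / sizeof(float))
        return 0;

    size_t bytes = count * sizeof(float);
    return bytes <= local_mem_limit ? bytes : 0;
}
```

In host code you would presumably read the limit into a cl_ulong with clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE, sizeof(limit), &limit, NULL), call the helper, and only when it returns a non-zero value pass that value as the size in clSetKernelArg(kernel, 0, bytes, NULL). Keep in mind that the kernel may use some local memory of its own, so the amount actually available can be smaller than the device limit.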
Not sure why people are saying you can't do this as it is something many people do with OpenCL (Yes, I understand it's not exactly the same but it works well enough for many cases).
Since OpenCL kernels are compiled at runtime from source text, you can simply set the array size to whatever you need in the source and recompile the kernel. This won't be ideal when the sizes vary widely, but I usually compile several different sizes at startup and then call the correct one as needed (in your case, chosen based on the kernel argument). If I get a new size I don't have a kernel for, I compile it right then and cache the kernel in case it comes up again.
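Here is a minimal sketch of the bookkeeping for that approach, assuming the kernel source declares its array as something like float scratch[ARRAY_SIZE]; and that the size is injected through the build options string passed to clBuildProgram. The macro name ARRAY_SIZE, the fixed-size cache, and all function names are illustrative assumptions; the void* handle stands in for the cl_kernel you would get from clCreateKernel.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_VARIANTS 16

/* One cached kernel variant, keyed by the array size it was compiled for.
 * `handle` is a placeholder for a cl_kernel (hypothetical layout). */
struct variant {
    size_t array_size;
    void  *handle;
};

static struct variant cache[MAX_VARIANTS];
static size_t cache_len = 0;

/* Writes a build option such as "-D ARRAY_SIZE=128" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int make_build_options(char *buf, size_t buf_len, size_t array_size)
{
    int n = snprintf(buf, buf_len, "-D ARRAY_SIZE=%zu", array_size);
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}

/* Returns the cached handle for array_size, or NULL if none is cached. */
void *cache_lookup(size_t array_size)
{
    for (size_t i = 0; i < cache_len; ++i)
        if (cache[i].array_size == array_size)
            return cache[i].handle;
    return NULL;
}

/* Remembers a freshly built variant. Returns 0 on success, -1 if full. */
int cache_store(size_t array_size, void *handle)
{
    if (cache_len == MAX_VARIANTS)
        return -1;
    cache[cache_len].array_size = array_size;
    cache[cache_len].handle = handle;
    ++cache_len;
    return 0;
}
```

The intended flow would be: call cache_lookup for the requested size; on a miss, build the options string, pass it as the options argument of clBuildProgram, create the kernel, and hand it to cache_store. A linear scan is enough here because the number of distinct sizes is expected to stay small.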