I'm having trouble passing a vector type (uint8) parameter to an OpenCL kernel function from the host code in C.
In the host I've got the data in an array:
cl_uint dataArr[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
(My real data is more than just [1, 8]; this is just for ease of explanation.)
I then transfer the data over to a buffer to be passed to the kernel:
cl_mem kernelInputData = clCreateBuffer(context,
CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*8, dataArr, NULL);
Next, I pass this buffer into the kernel:
clSetKernelArg(kernel, 0, sizeof(cl_mem), &kernelInputData);
And the kernel function's signature looks something like this:
kernel void kernelFunction(constant uint8 *vectorPtr)
However, the kernel doesn't seem to be obtaining the correct input data from the pointer to kernelInputData. When I pass values back from within the kernel, I see that vectorPtr points to something with this structure: ( 1, 2, 3, 4, 5, ?, ?, ? ), where the question marks are usually 4293848814 but sometimes 0. Either way, not what they're supposed to be.
What am I doing wrong?
EDIT:
I've switched from using an array to cl_uint8 on the host side. I now have:
cl_uint8 dataVector = { 1, 2, 3, 4, 5, 6, 7, 8 };
And I pass this vector to the kernel like so:
clSetKernelArg(kernel, 0, sizeof(cl_uint8), &dataVector);
And the kernel function's signature looks something like this:
kernel void kernelFunction(constant uint8 *vectorPtr)
However, running this code gives me a CL_INVALID_ARG_SIZE error on clSetKernelArg(). This error goes away if I switch the ARG_SIZE parameter to sizeof(cl_uint8 *), but then I get an EXC_BAD_ACCESS error in __dynamic_cast within clSetKernelArg().
My device is:
Apple Macbook Pro (mid-2009)
OSX 10.8 Mountain Lion
NVIDIA GeForce 9400M
OpenCL 1.0
CLH 1.0
You are defining an array of 8 cl_uint values. The creation of the cl_mem buffer and the call to clSetKernelArg are right, but the kernel parameter isn't: the kernel tries to read the data as an array of uint8 vectors instead of an array of uint.
If you want to use the vector data type, declare a single cl_uint8 on the host (an array of size 1) instead.
Or, if you want to keep the array of 8 cl_uint values, declare the kernel as: kernel void kernelFunction(constant uint *vectorPtr, uint size)
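For example, a minimal sketch of the array variant, reusing the buffer from the question (the size argument and the loop body are illustrative assumptions, not code from the original post):
// Host side: same kernelInputData buffer as before, plus the element count.
cl_uint count = 8;
clSetKernelArg(kernel, 0, sizeof(cl_mem), &kernelInputData);
clSetKernelArg(kernel, 1, sizeof(cl_uint), &count);
// Kernel side: the data is read as plain uint elements, not as a uint8 vector.
kernel void kernelFunction(constant uint *vectorPtr, uint size)
{
    for (uint i = 0; i < size; i++) {
        // ... use vectorPtr[i] ...
    }
}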
Edit:
The kernel parameter matching a cl_uint8 dataVector set directly with clSetKernelArg is not a pointer: the vector is passed by value. So the correct code is:
cl_uint8 dataVector = { 1, 2, 3, 4, 5, 6, 7, 8 };
clSetKernelArg(kernel, 0, sizeof(cl_uint8), &dataVector);
and
kernel void kernelFunction(uint8 dataVector)
(A by-value kernel argument lives in the private address space, so the constant qualifier is dropped here.)
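If you want to check the values arriving on the device, a small sketch (the global output buffer out and the extra kernel argument are assumptions added purely for verification):
kernel void kernelFunction(uint8 dataVector, global uint *out)
{
    // Write the eight vector components to out[0..7] so the host can read them back.
    vstore8(dataVector, 0, out);
}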
Minimal runnable example
An int2 is passed to the kernel. It is initially created as an array of cl_int.
#include <assert.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void) {
    /* Each work-item increments one int2, i.e. two of the host's cl_int values. */
    const char *source =
        "__kernel void main(__global int2 *out) {\n"
        "    out[get_global_id(0)]++;\n"
        "}\n";
    cl_command_queue command_queue;
    cl_context context;
    cl_device_id device;
    /* Host data is a plain array of 4 cl_int, viewed by the kernel as 2 int2. */
    cl_int input[] = {0, 1, 2, 3};
    const size_t global_work_size = sizeof(input) / sizeof(cl_int2);
    cl_kernel kernel;
    cl_mem buffer;
    cl_platform_id platform;
    cl_program program;

    /* Boilerplate: platform, device, context, queue (error checks omitted). */
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 1, &device, NULL);
    context = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
    command_queue = clCreateCommandQueue(context, device, 0, NULL);

    /* Copy the host array into a device buffer. */
    buffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                            sizeof(input), &input, NULL);

    /* Build the kernel and pass the buffer as its single argument. */
    program = clCreateProgramWithSource(context, 1, &source, NULL, NULL);
    clBuildProgram(program, 1, &device, "", NULL, NULL);
    kernel = clCreateKernel(program, "main", NULL);
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &buffer);

    /* Run 2 work-items (one per int2) and read the result back. */
    clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, &global_work_size,
                           NULL, 0, NULL, NULL);
    clFlush(command_queue);
    clFinish(command_queue);
    clEnqueueReadBuffer(command_queue, buffer, CL_TRUE, 0, sizeof(input),
                        &input, 0, NULL, NULL);

    /* Every element was incremented exactly once. */
    assert(input[0] == 1);
    assert(input[1] == 2);
    assert(input[2] == 3);
    assert(input[3] == 4);
    return EXIT_SUCCESS;
}
Tested on Ubuntu 15.10, OpenCL 1.2, NVIDIA driver 352.53.
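To build it on a typical Linux setup, compile and link against the OpenCL library, e.g. gcc example.c -o example -lOpenCL (the file name is arbitrary). On OS X the header is <OpenCL/opencl.h> and you link with -framework OpenCL instead of -lOpenCL.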