Exactly as the title suggests, I am looking for an efficient way to swap two OpenCL buffers. My kernel uses two global buffers, one as input and one as output. I invoke the kernel in a for loop with the same NDRange, each time setting the kernel arguments, enqueueing the kernel, and swapping the buffers, because the previous output buffer becomes the input buffer (the seed) for the next iteration.
What is the appropriate way to swap these two buffers? I imagine that copying the output buffer back to one of the already malloc'd host arrays with clEnqueueReadBuffer() and then copying it into the next input buffer with clEnqueueWriteBuffer() is an inefficient way to go. Otherwise, I am just using a temporary cl_mem variable to do my swapping.
You don't need to do that. Just set the right kernel arguments with clSetKernelArg before enqueueing your kernel a second time (using clEnqueueNDRangeKernel). The buffers stay on the device; nothing is copied back to the host.
Both buffers have to be created with CL_MEM_READ_WRITE in this case, of course, since each one is read in one iteration and written in the next.
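The handle swap itself can be modeled without any OpenCL calls. The sketch below is a plain C stand-in, not real host code: the float pointers play the role of the cl_mem handles, and the hypothetical step() function stands in for the two clSetKernelArg calls plus clEnqueueNDRangeKernel. It only illustrates that the handles are exchanged, never the data, and which buffer holds the result after the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the OpenCL kernel: reads `in`, writes `out`.
   Hypothetical example operation: add 1 to every element. */
static void step(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] + 1.0f;
}

/* Runs `iters` steps, ping-ponging between buffers a and b.
   Returns the buffer that holds the final result. */
static float *ping_pong(float *a, float *b, size_t n, int iters) {
    assert(a != b);
    float *in = a, *out = b;
    for (int it = 0; it < iters; ++it) {
        /* Real host code: set arg 0 = in, arg 1 = out, then enqueue. */
        step(in, out, n);
        /* Swap the handles only; no element is copied. */
        float *tmp = in;
        in = out;
        out = tmp;
    }
    /* After the last swap, `in` is the most recently written buffer
       (or the untouched input if iters is 0). */
    return in;
}
```

For an odd number of iterations the result ends up in the second buffer, for an even number in the first, so the final read-back has to use whichever handle the loop last wrote to.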
As the previous answer says: no, you don't need to swap buffers at all.
However, I don't agree with the proposed solution. The function clSetKernelArg() is not thread-safe when several host threads call it on the same cl_kernel object, and in my view it is not meant to be called inside the main loop.
A cleaner solution is to create two kernel objects from the same program and the same kernel function. This approach is more aligned with the OpenCL programming philosophy of "one kernel for one task". Having several kernel objects with the same code but different arguments is the way to go.
The first kernel will have:
kernel1 = clCreateKernel(program, "mykernel", NULL);
clSetKernelArg(kernel1, 0, sizeof(cl_mem), &buff1);
clSetKernelArg(kernel1, 1, sizeof(cl_mem), &buff2);
And the other one will be:
kernel2 = clCreateKernel(program, "mykernel", NULL);
clSetKernelArg(kernel2, 0, sizeof(cl_mem), &buff2);
clSetKernelArg(kernel2, 1, sizeof(cl_mem), &buff1);
This way, you don't need to stop the execution each iteration. You can simply run:
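for (int it = 0; it < iter; it++) {
    /* even iterations run kernel1, odd iterations run kernel2 */
    clEnqueueNDRangeKernel(command, it % 2 ? kernel2 : kernel1, ....);
}
clFinish(command);  /* command is the command queue */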
This approach should be better than changing the kernel arguments every iteration, because it needs fewer API calls. Additionally, on some systems clSetKernelArg() may be a blocking call due to poor API implementations, so it is better to avoid calling it in the loop as much as possible.
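The two-kernel idea can be modeled in plain C as well. In this sketch, which is an illustration and not real OpenCL host code, the hypothetical bound_kernel struct stands in for a cl_kernel whose arguments were set once before the loop, and run() stands in for clEnqueueNDRangeKernel. It assumes the first iteration reads buff1 and that the kernels execute in order, one after another.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a cl_kernel whose two buffer arguments were set once. */
typedef struct {
    const float *in;
    float *out;
} bound_kernel;

/* Hypothetical example operation: double every element. */
static void run(const bound_kernel *k, size_t n) {
    for (size_t i = 0; i < n; ++i)
        k->out[i] = k->in[i] * 2.0f;
}

/* Alternates two pre-bound kernels and returns the buffer
   that holds the final result. */
static float *iterate(float *buff1, float *buff2, size_t n, int iters) {
    assert(buff1 != buff2 && iters >= 0);
    const bound_kernel kernel1 = { buff1, buff2 };  /* reads buff1, writes buff2 */
    const bound_kernel kernel2 = { buff2, buff1 };  /* reads buff2, writes buff1 */
    for (int it = 0; it < iters; ++it)
        run(it % 2 == 0 ? &kernel1 : &kernel2, n);
    /* Odd count: the last write went to buff2. Even count: to buff1
       (for zero iterations buff1 still holds the input). */
    return (iters % 2 == 1) ? buff2 : buff1;
}
```

No arguments change inside the loop; the only per-iteration decision is which of the two pre-bound kernels to launch, and the parity of the iteration count tells you which buffer to read back at the end.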