In this tutorial
There are 2 methods to run the kernel, and another one mentioned in the comments:
1.
cl::KernelFunctor simple_add(cl::Kernel(program,"simple_add"),queue,cl::NullRange,cl::NDRange(10),cl::NullRange);
simple_add(buffer_A,buffer_B,buffer_C);
However, I found out, that KernelFunctor has gone.
So I tried the alternative way:
2.
cl::Kernel kernel_add=cl::Kernel(program,"simple_add");
kernel_add.setArg(0,buffer_A);
kernel_add.setArg(1,buffer_B);
kernel_add.setArg(2,buffer_C);
queue.enqueueNDRangeKernel(kernel_add,cl::NullRange,cl::NDRange(10),cl::NullRange);
queue.finish();
It compiles and runs succussfully.
However, there is a 3rd option in the comments:
3.
cl::make_kernel simple_add(cl::Kernel(program,"simple_add"));
cl::EnqueueArgs eargs(queue,cl::NullRange,cl::NDRange(10),cl::NullRange);
simple_add(eargs, buffer_A,buffer_B,buffer_C).wait();
Which does not compile, I think the make_kernel needs template arguments. I'm new to OpenCl, and didn't manage to fix the code.
My question is:
1. How should I modify the 3. code to compile?
2. Which way is better and why? 2. vs. 3.?
The procedure to build (compile) and install the latest Linux kernel from source is as follows: Grab the latest kernel from kernel.org. Verify kernel. Untar the kernel tarball.
Popular kernels are: Polynomial Kernel, Gaussian Kernel, Radial Basis Function (RBF), Laplace RBF Kernel, Sigmoid Kernel, Anove RBF Kernel, etc (see Kernel Functions or a more detailed description Machine Learning Kernels).
It is not difficult to write a simple OS kernel, especially if you have looked at a few others already. What is difficult is writing one in any way better than the dozens out there already, and/or duplicating the huge amount of functionality that Linux has built up over its decades of life.
You can check the OpenCL C++ Bindings Specification for a detailed description of the cl::make_kernel
API (in section 3.6.1), which includes an example of usage.
In your case, you could write something like this to create the kernel functor:
auto simple_add = cl::make_kernel<cl::Buffer&, cl::Buffer&, cl::Buffer&>(program, "simple_add");
Your second question is primarily opinion based, and so is difficult to answer. One could argue that the kernel functor approach is simpler, as it allows you to 'call' the kernel almost as if it were just a function and pass the arguments in a familiar manner. The alternative approach (option 2 in your question) is more explicit about setting arguments and enqueuing the kernel, but more closely represents how you would write the same code using the OpenCL C API. Which method you use is entirely down to personal preference.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With