I'm trying to templatize a CUDA kernel based on a boolean variable (as shown here: Should I unify two similar kernels with an 'if' statement, risking performance loss?), but I keep getting a compiler error that says my function is not a template. I think I'm just missing something obvious, which makes it pretty frustrating.
The following does NOT work:
util.cuh
#include "kernels.cuh"
//Utility functions
kernels.cuh
#ifndef KERNELS
#define KERNELS
template<bool approx>
__global__ void kernel(...params...);
#endif
kernels.cu
template<bool approx>
__global__ void kernel(...params...)
{
    if(approx)
    {
        //Approximate calculation
    }
    else
    {
        //Exact calculation
    }
}

template __global__ void kernel<false>(...params...); //Error occurs here
main.cu
#include "kernels.cuh"
kernel<false><<<dimGrid,dimBlock>>>(...params...);
The following DOES work:
util.cuh
#include "kernels.cuh"
//Utility functions
kernels.cuh
#ifndef KERNELS
#define KERNELS
template<bool approx>
__global__ void kernel(...params...);

template<bool approx>
__global__ void kernel(...params...)
{
    if(approx)
    {
        //Approximate calculation
    }
    else
    {
        //Exact calculation
    }
}
#endif
main.cu
#include "kernels.cuh"
kernel<false><<<dimGrid,dimBlock>>>(...params...);
If I add the line

template __global__ void kernel<false>(...params...);

at the end of kernels.cuh, it also works.
I get the following errors (both referring to the marked line above):
kernel is not a template
invalid explicit instantiation declaration
If it makes a difference, I compile all of my .cu files in one line, like:
nvcc -O3 -arch=sm_21 -I. main.cu kernels.cu -o program
An explicit instantiation must be visible at the point where the template is used. Your explicit instantiation is visible only in the kernels.cu translation unit, but not in main.cu.
The following code does indeed work correctly (apart from adding a __global__ qualifier to the explicit instantiation statement).
#include <cuda_runtime.h>
#include <stdio.h>

template<bool approx>
__global__ void kernel()
{
    if(approx)
    {
        printf("True branch\n");
    }
    else
    {
        printf("False branch\n");
    }
}

template __global__ void kernel<false>();

int main(void)
{
    kernel<false><<<1,1>>>();
    cudaDeviceSynchronize(); //wait for the kernel and flush its printf output
    return 0;
}
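For reference, this single-file example can be compiled with a command analogous to yours (test.cu being an assumed file name):

nvcc -O3 -arch=sm_21 test.cu -o test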
EDIT
In C++, templated functions are not compiled until they are instantiated, either implicitly by a call or explicitly. In this respect, CUDA, which now fully supports templates, behaves exactly the same way as C++.
As a concrete example, when the compiler finds something like
template<class T>
__global__ void kernel(...params...)
{
    ...
    T a;
    ...
}
it just checks the function syntax, but produces no object code. So, if you compiled a file containing only a templated function like the one above, you would get an "empty" object file. This is reasonable, since the compiler cannot know which type to substitute for T.
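One way to observe this (a sketch, assuming a Linux toolchain; templ.cu is a hypothetical file holding only the template definition, and the exact mangled names vary by compiler):

nvcc -O3 -arch=sm_21 -c templ.cu -o templ.o
nm templ.o | grep kernel    #no kernel symbol appears without an instantiation

After appending template __global__ void kernel<false>(...params...); to templ.cu and recompiling, the same command lists a mangled symbol for kernel<false>.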
The compiler produces object code only when it encounters an explicit instantiation of the function template. This is how the compilation of templated functions works, and it introduces a restriction for multiple-file projects: the implementation (definition) of a templated function must be visible in the same translation unit as its instantiation. So you cannot keep the interface in kernels.cuh separate from the implementation in kernels.cu, which is the main reason why the first version of your code does not compile. Accordingly, you must make both interface and implementation available in any file that uses the templates, i.e., you must include both kernels.cuh and kernels.cu in main.cu.
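As a sketch of that arrangement (the launch configuration and parameter list are placeholders, as in your snippets):

main.cu

#include "kernels.cuh"
#include "kernels.cu" //makes the template definition visible in this translation unit

int main(void)
{
    kernel<false><<<dimGrid,dimBlock>>>(...params...);
    cudaDeviceSynchronize();
    return 0;
}

Note that with this scheme kernels.cu should no longer be listed separately on the nvcc command line, since its contents are already compiled as part of main.cu.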
Since no code is generated from a template definition alone, compilers tolerate the same template file, containing both declarations and definitions, being included more than once in a project without generating linkage errors (in your project, for instance, both util.cuh and main.cu can include kernels.cuh).
There are several tutorials on using templates in C++. An Idiot's Guide to C++ Templates - Part 1, apart from the irritating title, will provide you with a step-by-step introduction to the topic.