I have a templated wrapper function that calls a kernel (__global__) defined in a .cu file, like this:
template<typename T, class M>
__global__ void compute_kernel(T* input, T* output, int n) {
    M m;
    // compute stuff using m
}
template<typename T, class M>
void compute(T* input, T* output, int n) {
    // ... compute blocks, threads, etc.
    compute_kernel<T, M><<<dim_grid, dim_block>>>(input, output, n);
    // ...
}
and a header file, to be included in host code, that contains only the declaration:
template<typename T, class M>
void compute(T* input, T* output, int n);
However, when I call compute() from the host with arbitrary template parameters, the build fails at link time with undefined reference to 'void compute<...>(...)', and only if I add explicit instantiations to the end of the .cu file does the code link:
template void compute<int, Method1<int> >(int* input, int* output, int n);
template void compute<float, Method1<float> >(float* input, float* output, int n);
template void compute<int, Method2<int> >(int* input, int* output, int n);
template void compute<float, Method2<float> >(float* input, float* output, int n);
So, is it necessary to explicitly instantiate every templated function in order to make it callable from the host? (That's quite a drawback.)
Thanks for your comments!
This is a C++ FAQ, not limited to CUDA.
If you have a template implementation in a .cpp or .cu file, then when the compiler processes that translation unit it cannot possibly know which combinations of template parameters you will need, so it emits code for none of them. When you link, you therefore get undefined-reference errors.
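Here is a minimal plain-C++ sketch of the same failure (the file and function names are made up for illustration, not from the original post):

// mylib.cpp -- the template definition lives only in this translation unit
template<typename T>
T twice(T x) { return x + x; }

// Without the next line, main.cpp below fails to link with
// "undefined reference to 'int twice<int>(int)'".
template int twice<int>(int);

// main.cpp -- sees only a declaration, just like the compute() header above
template<typename T> T twice(T x);

int main() { return twice(21); }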
You could put the implementation in a header file (which must then be included only from .cu files compiled by nvcc, since it contains a kernel launch), or you will have to explicitly instantiate all the required combinations. If you need many of these, you can use a macro to generate the instantiations.
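As a sketch of the macro approach, assuming the template definitions above live in the same .cu file (the macro name INSTANTIATE_COMPUTE is made up here):

// Expands to one explicit instantiation per (type, method) pair.
#define INSTANTIATE_COMPUTE(T, M) \
    template void compute<T, M<T> >(T* input, T* output, int n);

INSTANTIATE_COMPUTE(int, Method1)
INSTANTIATE_COMPUTE(float, Method1)
INSTANTIATE_COMPUTE(int, Method2)
INSTANTIATE_COMPUTE(float, Method2)

#undef INSTANTIATE_COMPUTE

Adding a new (type, method) pair is then a one-line change instead of another hand-written instantiation.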