 

Problem when calling template CUDA kernel

I've been trying to create template kernels, but I've been having some trouble calling them in my program. I have a Matrix<T> template class with some methods defined inside it.

Matrix.h:

template <typename T> class Matrix {
    ...
    void sum(Matrix<T>& m1, Matrix<T>& m2, Matrix<T>& sum);
    ...
};

#include "Matrix.cu"

Matrix.cu:

#include "MatrixKernel.h"

template<typename T> void Matrix<T>::sum(Matrix<T>& m1, Matrix<T>& m2, Matrix<T>& sum) {
    ...
    sumKernel<T><<<dimGrid, dimBlock>>>(m1, m2, sum);
    ...
}

MatrixKernel.h:

template<typename T> __global__ void sumKernel(const Matrix<T> m1, const Matrix<T> m2, Matrix<T> sum) {
...
}

The problem is that when I call sumKernel from inside of sum, the compiler gives me the following error:

error C2059: syntax error : '<'

Does somebody know what's going on? The code compiled fine just before I included the sumKernel call.

Thanks.

Renan asked Jun 14 '11 10:06



2 Answers

It seems you have a strange #include that causes code to be compiled by the wrong compiler. Distinguish GPU headers from CPU headers by using the .cu.h extension for CUDA headers, and make sure only NVCC compiles .cu and .cu.h files. The kernel and the kernel call should live in a .cu or .cu.h file, and those files should never be included, directly or indirectly, in .cpp files.

Because your .cu is being included in a header which is being compiled by the host compiler, the host compiler ends up hitting the token <<< - which it doesn't recognise. It probably does understand the token << so it consumes that, leaving an unexpected <.

Here's an alternative way of doing things that should work. I haven't tried it, but it's similar to code we use.

(Note: this might work, but it might not be the right way to solve the problem. My boss doesn't like it as a solution and would prefer to add an implementation per variation.)
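One possible reading of "an implementation per variation" is explicit template instantiation: the header that host code sees carries only a declaration, and a single implementation file defines the template and instantiates it once per supported element type. The sketch below is an assumption about what that could look like, not the code the answer refers to. The name sumArrays is hypothetical, and a plain loop stands in for the device copy and kernel launch so that the sketch compiles with any C++ compiler.

```cpp
#include <cassert>
#include <cstddef>

// --- What a host-visible header would contain: a declaration only. ---
template <typename T>
void sumArrays(const T* a, const T* b, T* out, std::size_t n);

// --- What the single implementation file would contain. ---
// In a CUDA build this body would copy the data to the device and launch
// a kernel; a plain loop stands in for that here (assumption).
template <typename T>
void sumArrays(const T* a, const T* b, T* out, std::size_t n) {
    assert(a != 0 && b != 0 && out != 0);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// One explicit instantiation per supported element type.
template void sumArrays<int>(const int*, const int*, int*, std::size_t);
template void sumArrays<float>(const float*, const float*, float*, std::size_t);
```

With this layout, host code that includes only the declaration never sees the kernel launch syntax; the trade-off is that every element type has to be listed by hand.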

The underlying problem seems to be lack of distinction between host and device code. I'm leaving the detail out in my solution - things like copying results to and from the device, sum implementation, etc.

The problem I'm trying to solve is, given a construct, how can you template it for use both on the host and the device?

I'll template Matrix.h on both the type and the implementation detail.

 template <typename T, template <typename> class Implementation> class Matrix {
 public:
     // Forward the work to the chosen implementation (host or GPU).
     void sum(Matrix& m1, Matrix& m2, Matrix& sum)
     {
         Implementation<T>().sumImp(m1, m2, sum);
     }
 };

The host implementation, HostMatrixSum.h, will do things on the CPU:

 #include "Matrix.h"

 template <typename T> struct HostMatrixSum
 {
     // MatrixT is the Matrix instantiation that uses this implementation.
     template <typename MatrixT>
     void sumImp(MatrixT& m1, MatrixT& m2, MatrixT& sum)
     {
         ...
     }
 };

While GpuMatrixSum.cu.h will upload the matrix, do the sum and recover the results:

 #include "Matrix.h"

 // A __global__ function cannot be a class member, so the kernel lives outside the struct.
 template <typename MatrixT> __global__ void sumKernel(const MatrixT m1, const MatrixT m2, MatrixT sum)
 {
     ...
 }

 template <typename T> struct GpuMatrixSum
 {
     template <typename MatrixT>
     void sumImp(MatrixT& m1, MatrixT& m2, MatrixT& sum)
     {
         ...
         sumKernel<MatrixT> <<< dimGrid, dimBlock >>> (m1, m2, sum);
         ...
     }
 };

Then, when we come to use Matrix from host code, we template on the host sum implementation and never need to see any CUDA specifics:

 #include "Matrix.h"
 #include "HostMatrixSum.h"

 Matrix<int, HostMatrixSum> m1(...);
 Matrix<int, HostMatrixSum> m2(...);
 Matrix<int, HostMatrixSum> result;
 result.sum(m1, m2, result);

And if we're working on the GPU, we can use the accelerated GPU implementation of sum:

 #include "Matrix.h"
 #include "GpuMatrixSum.cu.h"

 Matrix<int, GpuMatrixSum> m1(...);
 Matrix<int, GpuMatrixSum> m2(...);
 Matrix<int, GpuMatrixSum> result;
 result.sum(m1, m2, result);
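For reference, here is a minimal host-only sketch of the "template on the implementation" idea that compiles with an ordinary C++ compiler. It is an illustration under assumptions, not the code from the answer: the flat std::vector storage, the constructor, and the size() and operator[] accessors are all hypothetical additions made so the example is self-contained. A GPU implementation would expose the same sumImp signature from a file that only nvcc compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Matrix is parameterised on the element type and on an implementation policy.
template <typename T, template <typename> class Implementation>
class Matrix {
public:
    // Hypothetical storage: a flat row-major vector.
    Matrix(std::size_t rows, std::size_t cols, T value = T())
        : rows_(rows), cols_(cols), data_(rows * cols, value) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    std::size_t size() const { return data_.size(); }
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    // Forwards to the policy; Matrix itself holds no host- or device-specific code.
    void sum(const Matrix& m1, const Matrix& m2, Matrix& out) {
        Implementation<T>().sumImp(m1, m2, out);
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};

// CPU policy: element-wise addition in a plain loop.
template <typename T>
struct HostMatrixSum {
    template <typename MatrixT>
    void sumImp(const MatrixT& m1, const MatrixT& m2, MatrixT& out) const {
        assert(m1.size() == m2.size() && m1.size() == out.size());
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] = m1[i] + m2[i];
        }
    }
};
```

Usage follows the snippets above: construct Matrix<int, HostMatrixSum> objects and call result.sum(m1, m2, result). Because sumImp is a member template, the policy does not need to name the full Matrix type, which avoids a circular dependency between the two headers.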

Hope that works for you!

Matt Bond answered Oct 20 '22 20:10



I had the same problem: error C2059: syntax error : '<'

First, I found a good setup tutorial here (for Visual Studio 2010 Express and CUDA 4.0): http://www.stevenmarkford.com/installing-nvidia-cuda-with-visual-studio-2010/

What solved the syntax error for me was the question "How do I start a CUDA app in Visual Studio 2010?"

Specifically, change the properties of the *.cu file so that Type is set to "CUDA C/C++".

That finally worked for me.

biaspoint answered Oct 20 '22 21:10
