
Lambda expressions with CUDA

If I use thrust::transform with the thrust::host execution policy, passing a lambda works fine:

thrust::transform(thrust::host, a, a+arraySize,b,d,[](int a, int b)->int
{
    return a + b;
});

However, if I change thrust::host to thrust::device, the code no longer compiles. Here is the error from VS2013:

The closure type for a lambda ("lambda [](int, int)->int") cannot be used in the template argument type of a __global__ function template instantiation, unless the lambda is defined within a __device__ or __global__ function

So the question is: how do I use __device__ or __global__ so that a lambda can be passed to a Thrust algorithm running on the device?

spiritsaway asked May 25 '15 12:05


2 Answers

In CUDA 7 it is not possible. Quoting from Mark Harris:

That isn't supported today in CUDA, because the lambda is host code. Passing lambdas from host to device is a challenging problem, but it is something we will investigate for a future CUDA release.

What you can do in CUDA 7 is call thrust algorithms from your device code, and in that case you can pass lambdas to them...

With CUDA 7, thrust algorithms can be called from device code (e.g. CUDA kernels, or __device__ functors). In those situations, you can use (device) lambdas with thrust. An example is given in the parallelforall blog post here.
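To make the "call thrust from device code" approach concrete, here is a minimal sketch (not the example from the blog post). The names add_kernel and add_in_kernel are made up for illustration, error checking is omitted, and it assumes b has at least as many elements as a. The lambda is defined inside a __global__ function, so it is device code and the compiler error from the question does not apply. thrust::seq is used so the algorithm runs sequentially in the calling thread.

```cuda
#include <cstddef>
#include <cuda_runtime.h>
#include <thrust/execution_policy.h>
#include <thrust/transform.h>
#include <vector>

// Hypothetical kernel: one thread calls thrust::transform sequentially.
// The lambda is defined inside a __global__ function, so it is a device lambda.
__global__ void add_kernel(const int* a, const int* b, int* out, int n)
{
    thrust::transform(thrust::seq, a, a + n, b, out,
                      [](int x, int y) { return x + y; });
}

// Hypothetical host wrapper: copies the inputs to the device, launches the
// kernel with a single thread, and copies the result back.
// Assumes b.size() >= a.size(); CUDA error checking is omitted for brevity.
std::vector<int> add_in_kernel(const std::vector<int>& a, const std::vector<int>& b)
{
    const int n = static_cast<int>(a.size());
    const std::size_t bytes = n * sizeof(int);

    int* d_a = nullptr;
    int* d_b = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);

    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    add_kernel<<<1, 1>>>(d_a, d_b, d_out, n);

    // This blocking copy waits for the kernel to finish before reading d_out.
    std::vector<int> out(n);
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Calling add_in_kernel with {1, 2, 3} and {10, 20, 30} should give back the element-wise sums. A single-thread launch is only for illustration; real code would split the range across threads. With the CUDA 7 toolchain you would most likely also need -std=c++11 on the nvcc command line for the lambda to compile.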

However, CUDA 7.5 introduces an experimental device lambda feature. This feature is described here:

CUDA 7.5 introduces an experimental feature: GPU lambdas. GPU lambdas are anonymous device function objects that you can define in host code, by annotating them with a __device__ specifier.

To enable this feature (currently, with CUDA 7.5), specify --expt-extended-lambda on the nvcc command line.
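As a rough sketch of what the experimental feature allows, the lambda from the question can be annotated with __device__ and passed straight to thrust::transform from host code. The function name add_on_device is made up for illustration, and the sketch assumes b has at least as many elements as a.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/transform.h>
#include <vector>

// Hypothetical helper: element-wise sum computed on the GPU.
// Assumes b.size() >= a.size().
std::vector<int> add_on_device(const std::vector<int>& a, const std::vector<int>& b)
{
    // Copy the inputs into device memory.
    thrust::device_vector<int> d_a(a.begin(), a.end());
    thrust::device_vector<int> d_b(b.begin(), b.end());
    thrust::device_vector<int> d_out(a.size());

    // The __device__ annotation makes this an extended (GPU) lambda,
    // which is what --expt-extended-lambda enables.
    thrust::transform(thrust::device,
                      d_a.begin(), d_a.end(), d_b.begin(), d_out.begin(),
                      [] __device__ (int x, int y) { return x + y; });

    // Copy the result back to the host.
    std::vector<int> out(d_out.size());
    thrust::copy(d_out.begin(), d_out.end(), out.begin());
    return out;
}
```

Without --expt-extended-lambda on the nvcc command line, nvcc should reject the __device__ annotation on the lambda. Note that the lambda sits inside an ordinary named function; extended lambdas come with restrictions on where they can be defined, so check the CUDA programming guide for your toolkit version.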

Robert Crovella answered Oct 13 '22 22:10


This simple code using device lambdas works under CUDA 8.0 RC, although device lambdas are still experimental in this version of CUDA:

#include <iostream>   // std::cout, std::endl

#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/transform.h>

using namespace thrust::placeholders;

int main(void)
{
    // --- Input data 
    float a = 2.0f;
    float x[4] = { 1, 2, 3, 4 };
    float y[4] = { 1, 1, 1, 1 };

    thrust::device_vector<float> X(x, x + 4);
    thrust::device_vector<float> Y(y, y + 4);

    thrust::transform(X.begin(), 
                      X.end(),  
                      Y.begin(), 
                      Y.begin(),
                      [=] __host__ __device__ (float x, float y) { return a * x + y; }      // --- Lambda expression 
                     );        

    for (size_t i = 0; i < 4; i++) std::cout << a << " * " << x[i] << " + " << y[i] << " = " << Y[i] << std::endl;

    return 0;
}

Remember to compile with --expt-extended-lambda.

Vitality answered Oct 13 '22 23:10