
Is restrict(amp) more restrictive than CUDA kernel code?

In C++ AMP, kernel functions or lambdas are marked with restrict(amp), which imposes severe restrictions on the allowed subset of C++ (listed here). Does CUDA allow any more freedom on the subset of C or C++ in kernel functions?

Asked by Eugene on Mar 12 '12 at 20:03



1 Answer

As of Visual Studio 11 and CUDA 4.1, restrict(amp) functions are more restrictive than CUDA's analogous __device__ functions. Most notably, AMP is stricter about how pointers can be used. This is a natural consequence of AMP's DirectX 11 computational substrate, which disallows pointers in HLSL (graphics shader) code. By contrast, CUDA's lower-level IR is PTX, which is more general-purpose than HLSL.
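To make the pointer difference concrete, here is a minimal sketch of code that CUDA __device__ functions accept but that restrict(amp) would reject, going by the comparison in this answer. The names `Span`, `sum_span`, and `sum_rows` are hypothetical, and the fallback macros are an assumption added so the snippet also builds with a plain C++ compiler; under nvcc the qualifiers take effect as usual.

```cpp
// Sketch only: when not compiled by nvcc, define the CUDA qualifiers away
// so the same source builds as ordinary C++.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// A structure holding a raw pointer: accepted in CUDA device code,
// not allowed in a restrict(amp) function.
struct Span {
    const int* data;
    int        size;
};

// Sums the elements that the span points at.
__host__ __device__ int sum_span(Span s) {
    int total = 0;
    for (int i = 0; i < s.size; ++i) total += s.data[i];
    return total;
}

// A pointer to pointer (multiple indirection): also accepted by CUDA,
// rejected by restrict(amp).
__host__ __device__ int sum_rows(const int* const* rows, int n_rows, int n_cols) {
    int total = 0;
    for (int r = 0; r < n_rows; ++r) {
        Span row = { rows[r], n_cols };
        total += sum_span(row);
    }
    return total;
}
```

In an actual kernel, every pointer involved would have to refer to device-accessible memory; the sketch only illustrates what the language subset permits.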

Here's a line-by-line comparison:

| VS 11 AMP restrict(amp) functions     | CUDA 4.1 sm_2x __device__ functions  |
|------------------------------------------------------------------------------|
|* can only call functions that have    |* can only call functions that have   |
|  the restrict(amp) clause             |  the __device__ decoration           |
|* The function must be inlinable       |* need not be inlined                 |
|* The function can declare only        |* Class types are allowed             |
|  POD variables                        |                                      |
|* Lambda functions cannot              |* Lambdas are not supported, but      |
|  capture by reference and             |  user functors can hold pointers     |
|  cannot capture pointers              |                                      |
|* References and single-indirection    |* References and multiple-indirection |
|  pointers are supported only as       |  pointers are supported              |
|  local variables and function         |                                      |
|  arguments                            |                                      |
|* No recursion                         |* Recursion OK                        |
|* No volatile variables                |* Volatile variables OK               |
|* No virtual functions                 |* Virtual functions OK                |
|* No pointers to functions             |* Pointers to functions OK            |
|* No pointers to member functions      |* Pointers to member functions OK     |
|* No pointers in structures            |* Pointers in structures OK           |
|* No pointers to pointers              |* Pointers to pointers OK             |
|* No goto statements                   |* goto statements OK                  |
|* No labeled statements                |* Labeled statements OK               |
|* No try, catch, or throw statements   |* No try, catch, or throw statements  |
|* No global variables                  |* Global __device__ variables OK      |
|* Static variables through tile_static |* Static variables through __shared__ |
|* No dynamic_cast                      |* No dynamic_cast                     |
|* No typeid operator                   |* No typeid operator                  |
|* No asm declarations                  |* asm declarations (inline PTX) OK    |
|* No varargs                           |* No varargs                          |
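A few rows of the table can be sketched in code. The snippet below uses recursion and a pointer to function, both listed as OK for CUDA 4.1 sm_2x __device__ functions and as disallowed under restrict(amp). The names `factorial`, `add`, `mul`, `BinaryOp`, and `fold` are hypothetical, and the fallback macros are an assumption so that the code also compiles as plain C++ without nvcc.

```cpp
// Sketch only: when not compiled by nvcc, define the CUDA qualifiers away
// so the same source builds as ordinary C++.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Recursion: listed as OK for sm_2x device functions, forbidden by restrict(amp).
__host__ __device__ unsigned factorial(unsigned n) {
    return n <= 1 ? 1u : n * factorial(n - 1);
}

__host__ __device__ int add(int a, int b) { return a + b; }
__host__ __device__ int mul(int a, int b) { return a * b; }

// Pointer to function: listed as OK in CUDA, forbidden by restrict(amp).
// Note: in a real kernel, the address of a device function has to be
// taken in device code, not passed in from the host.
typedef int (*BinaryOp)(int, int);

// Left fold of xs[0..n) with a caller-supplied binary operation.
__host__ __device__ int fold(const int* xs, int n, int init, BinaryOp op) {
    int acc = init;
    for (int i = 0; i < n; ++i) acc = op(acc, xs[i]);
    return acc;
}
```

A restrict(amp) version of `fold` would typically take the operation as a template parameter (a functor) instead, since everything it calls must be inlinable.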

You can read more about restrict(amp)'s restrictions here. You can read about C++ support in CUDA __device__ functions in Appendix D of the CUDA C Programming Guide.

Answered by Jared Hoberock on Sep 28 '22 at 07:09
