How do you include standard CUDA libraries to link with NVRTC code?

Tags: c, cuda, gpu, nvrtc

Specifically, my issue is that I have CUDA code that needs <curand_kernel.h> to run. This isn't included by default in NVRTC. Presumably then when creating the program context (i.e. the call to nvrtcCreateProgram), I have to send in the name of the file (curand_kernel.h) and also the source code of curand_kernel.h? I feel like I shouldn't have to do that.

It's hard to tell; I haven't managed to find an example from NVIDIA of someone needing standard CUDA files like this as a source, so I really don't understand what the syntax is. Some issues: curand_kernel.h also has includes. Do I have to do the same for each of these? I am also not sure the NVRTC compiler will run correctly on curand_kernel.h, because there are some language features it doesn't support, aren't there?

Next: if I've sent in the source code of a header file to nvrtcCreateProgram, do I still have to #include it in the code to be executed, or will it cause an error if I do so?

A link to example code that does this or something like it would be appreciated much more than a straightforward answer; I really haven't managed to find any.

asked Oct 17 '16 13:10

Billy Smith

1 Answer

You have to send the "filename" and the source of each header separately.

When the preprocessor does its thing, it'll use any #include filenames as a key to find the source for the header, based on the collection that you provide.

I suspect that, in this case, the compiler (driver) doesn't have file system access, so you have to give it the source in much the same way that you would for shader includes in OpenGL.

So:

- Pass each header's name and its source text when calling nvrtcCreateProgram (the headers and includeNames arguments, with numHeaders giving the count). The compiler will, internally, build the equivalent of a std::map<string,string> containing the source of each header, indexed by the given name.
- In your kernel source, use #include "foo.cuh" as usual.
- The compiler will use foo.cuh as a key into that internal map (created when you called nvrtcCreateProgram) and retrieve the header source from it.
- Compilation proceeds as normal.
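
Since the question asks for example code, here is a minimal host-side sketch of the steps above. The header name foo.cuh, its contents, the kernel scale, the program name scale.cu, and the helper compileToPtx are all made up for illustration; only the nvrtc* calls are NVRTC's API. The __has_include guard is there just so the snippet still builds on a machine without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical header: its "file name" and its full text, both held in memory.
const char* const kHeaderNames[] = {"foo.cuh"};
const char* const kHeaderSources[] = {
    "__device__ inline float twice(float x) { return 2.0f * x; }\n"};
const int kNumHeaders = 1;

// Kernel source that pulls the header in with an ordinary #include.
const char* const kKernelSource =
    "#include \"foo.cuh\"\n"
    "extern \"C\" __global__ void scale(float* data) {\n"
    "    data[threadIdx.x] = twice(data[threadIdx.x]);\n"
    "}\n";

#if __has_include(<nvrtc.h>)
#include <nvrtc.h>

// Compiles kKernelSource to PTX; returns an empty string on failure.
inline std::string compileToPtx() {
    nvrtcProgram prog;
    // Last three arguments: header count, header sources, header names.
    if (nvrtcCreateProgram(&prog, kKernelSource, "scale.cu", kNumHeaders,
                           kHeaderSources, kHeaderNames) != NVRTC_SUCCESS) {
        return "";
    }
    std::string ptx;
    if (nvrtcCompileProgram(prog, 0, nullptr) == NVRTC_SUCCESS) {
        std::size_t size = 0;
        nvrtcGetPTXSize(prog, &size);  // size includes the trailing NUL
        ptx.resize(size);
        nvrtcGetPTX(prog, &ptx[0]);
    } else {
        // On failure, print the compiler log so missing headers show up.
        std::size_t logSize = 0;
        nvrtcGetProgramLogSize(prog, &logSize);
        std::string log(logSize, '\0');
        nvrtcGetProgramLog(prog, &log[0]);
        std::fprintf(stderr, "%s\n", log.c_str());
    }
    nvrtcDestroyProgram(&prog);
    return ptx;
}
#endif
```

Every header that foo.cuh itself includes would need its own entry in the two arrays. NVRTC's documented compile options also include --include-path=<dir> (-I), so passing the toolkit's include directory in the options array of nvrtcCompileProgram may be a less painful route for headers such as curand_kernel.h; whether that header then compiles cleanly likely depends on the toolkit version.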

One of the reasons that NVRTC provides only a "subset" of features is that the compiler plays in a somewhat sandboxed environment, without necessarily having all of the supporting tools and utilities lying around that you have with offline compilation. So, you have to manually handle a lot of the stuff that the normal nvcc + (gcc | MSVC | clang) combination provides.

A possible, but non-ideal, solution would be to preprocess the file that you need in your IDE, save the result, and then #include that. However, I bet there is a better way to do that.

If you just want curand, consider diving into the library and extracting the part you need (blech), or using another GPU-friendly rand implementation. On older CUDA versions, I just generated a big array of random floats on the host, uploaded it to the GPU, and sampled it in the kernels.
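
As one illustration of a "GPU-friendly rand" that needs no headers at all, here is a sketch built on Marsaglia's xorshift32 generator. This is a stand-in, not curand: its statistical quality is much weaker, the state must be nonzero, and how each thread gets a distinct seed is left open. The names RNG_FN, xorshift32, and toUnitFloat are invented for this example.

```cpp
// Under a CUDA compiler (nvcc or NVRTC) the helpers become device functions;
// on a plain host compiler the macro expands to nothing, so they can be tested.
#ifdef __CUDACC__
#define RNG_FN __device__
#else
#define RNG_FN
#endif

// One xorshift32 step: takes the current state, returns the next one.
// Assumes a 32-bit unsigned int and a nonzero state.
RNG_FN inline unsigned int xorshift32(unsigned int state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Maps the top 24 bits of a 32-bit value to a float in [0, 1).
RNG_FN inline float toUnitFloat(unsigned int bits) {
    return (bits >> 8) * (1.0f / 16777216.0f);
}
```

In a kernel, each thread would keep its own state in a register, call xorshift32 to advance it, and pass the new state to toUnitFloat whenever it needs a sample. Because the code uses only built-in types, it can be pasted straight into the source string handed to nvrtcCreateProgram.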

This related link may be helpful.

answered Sep 17 '22 14:09

3Dave
