The CUDA driver API can load a file containing PTX code from the filesystem. One usually does the following:
CUmodule module;
CUfunction function;
CUresult err;  /* status code returned by each driver call */
const char* module_file = "my_prg.ptx";
const char* kernel_name = "vector_add";
err = cuModuleLoad(&module, module_file);
err = cuModuleGetFunction(&function, module, kernel_name);
If the PTX is generated at run time (on the fly), going through file I/O seems wasteful, since the driver only has to read the file back in again.
Is there a way to pass the PTX program to the CUDA driver directly (e.g. as a C string)?
PTX is a low-level parallel-thread-execution virtual machine and ISA (Instruction Set Architecture). PTX can be output from multiple tools or written directly by developers. PTX is meant to be GPU-architecture independent, so that the same code can be reused for different GPU architectures.
NVRTC is a runtime compilation library for CUDA C++. It accepts CUDA C++ source code in character string form and creates handles that can be used to obtain the PTX.
CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach called general-purpose computing on GPUs (GPGPU).
Taken from the ptxjit CUDA example:
Define the PTX program as a C string:
char myPtx32[] = "\n\
.version 1.4\n\
.target sm_10, map_f64_to_f32\n\
.entry _Z8myKernelPi (\n\
.param .u32 __cudaparm__Z8myKernelPi_data)\n\
{\n\
.reg .u16 %rh<4>;\n\
.reg .u32 %r<8>;\n\
// Other stuff\n\
.loc 28 18 0\n\
exit;\n\
}\n\
";
then
cuModuleLoadDataEx(phModule, myPtx32, 0, 0, 0);
and finally, as in the question, look up the kernel by the entry name used in the PTX:
cuModuleGetFunction(&function, *phModule, "_Z8myKernelPi");
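For the case raised in the question, where the PTX is generated at run time rather than written as a literal, the same call should work on any NUL-terminated buffer. The following is a minimal sketch of that idea, not code from the ptxjit sample: `build_ptx` is a hypothetical helper, and the `.version`, `.target`, and `.address_size` values are assumptions you would adjust to your toolkit and GPU. The driver call is shown only in a comment so the block compiles without the CUDA headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a minimal PTX module with one empty kernel
 * named `kernel_name` into `buf` (capacity `cap` bytes).
 * Returns the string length, or -1 if the buffer is too small.
 * The .version/.target/.address_size values are illustrative assumptions. */
static int build_ptx(char *buf, size_t cap, const char *kernel_name)
{
    int n = snprintf(buf, cap,
        ".version 6.0\n"
        ".target sm_50\n"
        ".address_size 64\n"
        "\n"
        ".visible .entry %s()\n"
        "{\n"
        "    ret;\n"
        "}\n",
        kernel_name);

    /* snprintf reports the length it wanted to write; treat truncation
     * as an error so a partial module is never handed to the driver. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}

/* With a current CUDA context, the buffer could then be loaded directly,
 * with no file on disk:
 *
 *     CUmodule module;
 *     CUfunction function;
 *     CUresult err = cuModuleLoadDataEx(&module, buf, 0, NULL, NULL);
 *     err = cuModuleGetFunction(&function, module, kernel_name);
 */
```

A caller would fill a buffer with something like `build_ptx(buf, sizeof buf, "vector_add")` and pass `buf` as the image argument. Checking for truncation matters here because the driver reads the PTX text up to its terminating NUL, so a cut-off module would fail to load.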