Is it possible to compile (C++) code for the GPU with nvcc into a shared object (.so file) and load it dynamically from a C++ program (in this case, CERN's ROOT, which is essentially a C++ interpreter ("CINT"))?
A simple example that I would like to run is:
#include <cstdio>   // printf
#include <cstdlib>  // exit
extern "C"
void TestCompiled() {
    printf("test\n");
    exit(0);
}
This code was compiled with nvcc --compiler-options '-fPIC' -o TestCompiled_C.so --shared TestCompiled.cu. Loading the shared object into ROOT with:
{ // Test.C program
    int error, check;
    check = gROOT->LoadMacro("TestCompiled_C.so", &error);
    cout << "check " << check << " " << " error: " << error << endl;
    TestCompiled(); // run macro
    exit(0);
}
loads the library OK, but does not find TestCompiled():
$ root -b -l Test.C
root [0]
Processing Test.C...
check 0 error: 0
Error: Function Hello() is not defined in current scope Test.C:11:
*** Interpreter error recovered ***
Doing the same by compiling the first test script with ROOT (without the extern line, compiling with root TestCompiled.C++) works… What can I try in order to make the C++ program find the test function when nvcc does the compilation?
I am assuming that the shared object file being output is like any other shared library, such as one created with GCC using the -shared option. In this case, to load the object dynamically, you will need to use the dlopen function to get a handle to the shared object. Then, you can use the dlsym function to look for a symbol in the file.
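// Open the shared object and resolve all of its symbols immediately.
void *object_handle = dlopen("TestCompiled_C.so", RTLD_NOW);
if (object_handle == NULL)
{
    printf("%s\n", dlerror());
    // Exit or return error code
}
// Look up the function by its unmangled (extern "C") name.
void *test_compiled_ptr = dlsym(object_handle, "TestCompiled");
if (test_compiled_ptr == NULL)
{
    printf("%s\n", dlerror());
    // Exit or return error code
}
// Convert the generic pointer to a function pointer and call it.
void (*test_compiled)() = (void (*)()) test_compiled_ptr;
test_compiled();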
You will need to include dlfcn.h and link with -ldl when you compile.
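For reference, the fragment above can be wrapped into a self-contained helper. This is only a sketch under a few assumptions: the helper name call_void_function is hypothetical (it is not part of the original answer or of any library), the target function takes no arguments and returns nothing, and the build line in the comment is an assumed g++ invocation.

```cpp
// Sketch: load a shared object at run time and call a void() function from it.
// Assumed build line: g++ -std=c++11 loader.cpp -o loader -ldl
#include <dlfcn.h>  // dlopen, dlsym, dlerror, dlclose
#include <cstdio>   // std::fprintf

// Hypothetical helper, introduced here for illustration only.
// Returns true if `symbol` was found in `library_path` and has been called.
bool call_void_function(const char* library_path, const char* symbol) {
    void* handle = dlopen(library_path, RTLD_NOW);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }

    dlerror();  // clear any stale error state before calling dlsym
    void* address = dlsym(handle, symbol);
    if (address == nullptr) {
        const char* message = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n",
                     message ? message : "symbol resolved to a null address");
        dlclose(handle);
        return false;
    }

    // Convert the generic pointer returned by dlsym into a function pointer.
    void (*function)() = reinterpret_cast<void (*)()>(address);
    function();

    dlclose(handle);
    return true;
}
```

A call such as call_void_function("./TestCompiled_C.so", "TestCompiled") would then run the test function. The leading ./ matters: when the name contains no slash, dlopen searches the standard library directories and LD_LIBRARY_PATH rather than the current directory. Since TestCompiled() in the question calls exit(0), the helper would not return in that particular case.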
The difference between this and what you are doing now is that you are loading the library statically rather than dynamically. Even though shared objects are "dynamically linked libraries," as they are called in the Windows world, doing it the way you are now loads all of the symbols in the object when the program is launched. To dynamically load certain symbols at runtime, you need to do it this way.
I'm copying, for reference, the salient points of the answer from the RootTalk forum that solved the problem:
A key point is that the C interpreter of ROOT (CINT) requires a "CINT dictionary" for the externally compiled function. (There is no problem when compiling through ROOT, because ACLiC creates this dictionary when it pre-compiles the macro [root TestCompiled.C++].)
So, an interface TestCompiled.h++ must be created:
#ifdef __cplusplus
extern "C" {
#endif
void TestCompiled(void);
#ifdef __cplusplus
} /* end of extern "C" */
#endif
The interface must then be loaded inside ROOT along with the shared object:
{ // Test.C ROOT/CINT unnamed macro (interpreted)
    Int_t check, error;
    check = gROOT->LoadMacro("TestCompiled_C.so", &error);
    std::cout << "_C.so check " << check << " error " << error << std::endl;
    check = gROOT->LoadMacro("TestCompiled.h++", &error);
    std::cout << "_h.so check " << check << " error " << error << std::endl;
    TestCompiled(); // execute the compiled function
}
ROOT can now use the externally compiled program: root -b -l -n -q Test.C works.
This can be tested with, e.g., g++ on the following TestCompiled.C:
#include <cstdio>
extern "C" void TestCompiled(void) { printf("test\n"); }
compiled with
g++ -fPIC -shared -o TestCompiled_C.so TestCompiled.C