Creating a static CUDA library to be linked with a C++ program

Tags: c++, cuda

I am attempting to link a CUDA kernel into a C++ autotools project, but I cannot get past the linking stage.

I have a file GPUFloydWarshall.cu that contains the kernel and a wrapper C function, which I would like to place into a library libgpu.a. This would be consistent with the remainder of the project. Is this at all possible?

Secondly, the library would then need to be linked with around ten other libraries into the main executable, which at the moment is built using mpicxx.

Currently I am using the commands below to compile the kernel and create the libgpu.a library:

nvcc   -rdc=true -c -o temp.o GPUFloydWarshall.cu
nvcc -dlink -o GPUFloydWarshall.o temp.o -L/usr/local/cuda/lib64 -lcuda -lcudart
rm -f libgpu.a
ar cru libgpu.a GPUFloydWarshall.o
ranlib libgpu.a

When this is all linked into the main executable, I get the following error:

problem/libproblem.a(libproblem_a-UTRP.o): In function `UTRP::evaluate(Solution&)':
UTRP.cpp:(.text+0x1220): undefined reference to `gpu_fw(double*, int)'

The gpu_fw function is my wrapper function.

Matt John asked Nov 12 '14 18:11


1 Answer

Is this at all possible?

Yes, it's possible. And creating a (non-CUDA) wrapper function around the kernel makes it even easier. You can make your life easier still if you rely on C++ linking throughout (you mention a wrapper C function). mpicxx is a C++ compiler/linker alias, and CUDA files (.cu) follow C++ compiler/linker behavior by default. Here's a very simple question that discusses building CUDA code (encapsulated in a wrapper function) into a static library.

Secondly, the library would then need to be linked with around ten other libraries into the main executable, which at the moment is built using mpicxx.

Once you have a C/C++ (non-CUDA) wrapper exposed in your library, linking should be no different from ordinary linking of ordinary libraries. You may still need to pass the CUDA runtime library and any other CUDA libraries you use in the link step, but this is conceptually the same as for any other library your project depends on.
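One way to check the "ordinary linking" part in isolation is to build libgpu.a from a host-only stand-in first. The sketch below keeps the exact signature from your error message, gpu_fw(double*, int), with C++ linkage (no extern "C"), since the linker is looking for the C++ symbol. Everything else is an assumption on my part: I am guessing that the pointer is an n x n row-major distance matrix updated in place and that the int is n. The loop is a plain CPU Floyd-Warshall, not your kernel.

```cpp
// Hypothetical shared declaration. In a real project this would sit in a
// header (for example gpu_fw.h, a made-up name) included by both
// GPUFloydWarshall.cu and UTRP.cpp so that both sides agree on the symbol.
// No extern "C" here: the linker error asks for the C++ symbol
// gpu_fw(double*, int).
void gpu_fw(double* dist, int n);

// Host-only stand-in with the same signature.
// Assumption (not from the question): dist is an n x n row-major matrix of
// edge weights, overwritten in place with all-pairs shortest path lengths.
void gpu_fw(double* dist, int n) {
    for (int k = 0; k < n; ++k) {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) {
                // Relax the path i -> j through intermediate vertex k.
                double via_k = dist[i * n + k] + dist[k * n + j];
                if (via_k < dist[i * n + j]) {
                    dist[i * n + j] = via_k;
                }
            }
        }
    }
}
```

If the main executable links against a libgpu.a built from this file, the remaining problem is in how the CUDA objects are put into the archive, not in the autotools/mpicxx link line. If it still fails with the same undefined reference, the declaration seen by UTRP.cpp and the definition in the library probably do not match (for instance, one side using extern "C" and the other not).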

EDIT:

It's not clear that you need device linking for what you want to do. (It's acceptable, it just complicates things a bit.) Anyway, now that you have shown the command sequence, your construction of the library is not quite correct. The device link command (nvcc -dlink) produces an object that holds the linked device code but does not include the necessary host pieces. To get everything in one place, we want to add both GPUFloydWarshall.o (which has the device-linked pieces) AND temp.o (which has the host code pieces) to the library.

Here's a fully worked example:

$ cat GPUFloydWarshall.cu
#include <stdio.h>

__global__ void mykernel(){
  printf("hello\n");
}

void gpu_fw(){
  mykernel<<<1,1>>>();
  cudaDeviceSynchronize();
}


$ cat main.cpp
#include <stdio.h>

void gpu_fw();

int main(){

  gpu_fw();
}

$ nvcc   -rdc=true -c -o temp.o GPUFloydWarshall.cu
$ nvcc -dlink -o GPUFloydWarshall.o temp.o -lcudart
$ rm -f libgpu.a
$ ar cru libgpu.a GPUFloydWarshall.o temp.o
$ ranlib libgpu.a
$ g++ main.cpp -L. -lgpu -o main -L/usr/local/cuda/lib64 -lcudart
$ ./main
hello
$
Robert Crovella answered Oct 03 '22 09:10