For example, cudaMemcpy and cuMemcpy? I can see that the function definitions are different, but I mean the API in general. Why is there an API starting with cu... and one starting with cuda...? When should each API be used?
CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach called general-purpose computing on GPUs (GPGPU).
Figure 1 shows that a CUDA kernel is a function that gets executed on the GPU. The parallel portion of your application is executed K times in parallel by K different CUDA threads, as opposed to only once like a regular C/C++ function. Figure 1. The kernel is a function executed on the GPU.
The API whose function names start with cu... is the so-called Driver API. The API whose function names start with cuda... is the Runtime API.
Originally (before CUDA 3.0), the two APIs were completely separate. A rough classification was: the Runtime API is simpler and more convenient; the Driver API is intended for more complex, "low-level" programming (and perhaps library development).
Since CUDA 3.0, both APIs are interoperable. That means that, for example, when you allocate memory with the Driver API using cuMemAlloc, you can also use the same memory in Runtime API calls such as cudaMemcpy.
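As a rough illustration of this interoperability, the sketch below allocates a buffer with the Driver API, fills it through the Runtime API, and reads it back with the Driver API again. The helper names checkDrv and checkRt, the file name, and the build command are assumptions made for this example. It also assumes that device 0 exists and that calling cudaFree(0) is enough to make the runtime's primary context current on the calling thread, which is a common idiom rather than the only way to set things up.

```cuda
// Assumed build command: nvcc interop.cu -o interop -lcuda
#include <cstdio>
#include <cstdlib>
#include <cuda.h>          // Driver API  (cu... functions)
#include <cuda_runtime.h>  // Runtime API (cuda... functions)

// Hypothetical helper: abort on a Driver API error.
static void checkDrv(CUresult r, const char* what) {
    if (r != CUDA_SUCCESS) {
        const char* msg = nullptr;
        cuGetErrorString(r, &msg);
        fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: abort on a Runtime API error.
static void checkRt(cudaError_t e, const char* what) {
    if (e != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(e));
        exit(EXIT_FAILURE);
    }
}

int main() {
    const int n = 4;
    float in[n] = {1.0f, 2.0f, 3.0f, 4.0f};
    float out[n] = {0.0f, 0.0f, 0.0f, 0.0f};

    // Initialize the driver, then let the runtime create its primary
    // context and make it current on this thread (assumption: device 0).
    checkDrv(cuInit(0), "cuInit");
    checkRt(cudaSetDevice(0), "cudaSetDevice");
    checkRt(cudaFree(0), "cudaFree(0)");

    // Driver API allocation in the current (primary) context.
    CUdeviceptr dptr = 0;
    checkDrv(cuMemAlloc(&dptr, sizeof(in)), "cuMemAlloc");

    // Runtime API copy into the Driver-API-allocated buffer.
    checkRt(cudaMemcpy(reinterpret_cast<void*>(dptr), in, sizeof(in),
                       cudaMemcpyHostToDevice),
            "cudaMemcpy");

    // Driver API copy back to the host.
    checkDrv(cuMemcpyDtoH(out, dptr, sizeof(out)), "cuMemcpyDtoH");

    for (int i = 0; i < n; ++i) printf("out[%d] = %g\n", i, out[i]);

    checkDrv(cuMemFree(dptr), "cuMemFree");
    return 0;
}
```

The cast from CUdeviceptr to void* is what lets the same allocation be passed to either API.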
The major practical difference was that in the Runtime API, you could use the special kernel<<<...>>> launching syntax, whereas in the Driver API, you could load your CUDA programs as "modules" (with functions like cuModuleLoad), given in the form of CUBIN files or PTX files, and launch these kernels programmatically using cuLaunchKernel.
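To make the two launch styles concrete, here is a minimal sketch that launches the same kernel both ways. The kernel name scale, the file name scale.ptx, and the build commands are assumptions for this example, not anything from the original answer. The kernel is declared extern "C" so that cuModuleGetFunction can find it under its unmangled name, and the Driver API part is skipped if the PTX file has not been generated.

```cuda
// Assumed build commands:
//   nvcc launch.cu -o launch -lcuda
//   nvcc -ptx launch.cu -o scale.ptx   (PTX module for the Driver API part)
#include <cstdio>
#include <cuda.h>          // Driver API
#include <cuda_runtime.h>  // Runtime API

// Hypothetical example kernel: multiplies each element by a factor.
extern "C" __global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 8;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = static_cast<float>(i);

    cuInit(0);  // harmless if the driver is already initialized

    float* d = nullptr;
    if (cudaMalloc(&d, sizeof(h)) != cudaSuccess) return 1;
    if (cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice) != cudaSuccess) return 1;

    // Runtime API: kernel<<<grid, block>>>(args...) launch syntax.
    scale<<<1, n>>>(d, 2.0f, n);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    // Driver API: load a module from a PTX file and launch programmatically.
    CUmodule mod;
    if (cuModuleLoad(&mod, "scale.ptx") == CUDA_SUCCESS) {
        CUfunction fn;
        if (cuModuleGetFunction(&fn, mod, "scale") == CUDA_SUCCESS) {
            float factor = 2.0f;
            int count = n;
            // One pointer per kernel parameter, in declaration order.
            void* args[] = { &d, &factor, &count };
            cuLaunchKernel(fn,
                           1, 1, 1,   // grid dimensions
                           n, 1, 1,   // block dimensions
                           0,         // dynamic shared memory in bytes
                           nullptr,   // default stream
                           args, nullptr);
            cuCtxSynchronize();
        }
        cuModuleUnload(mod);
    } else {
        printf("scale.ptx not found; skipping the Driver API launch\n");
    }

    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("h[%d] = %g\n", i, h[i]);

    cudaFree(d);
    return 0;
}
```

Note that the Driver API launch spells out everything the <<<...>>> syntax hides: grid and block dimensions, shared memory size, stream, and an explicit array of argument pointers.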
In fact, I think that for the largest part of a CUDA program the differences are negligible: nearly all other functionality (except for kernel/module handling) is available in both APIs and is nearly identical in both. This applies to functions (cuMemcpy and cudaMemcpy, etc.) as well as to types (CUevent and cudaEvent_t, etc.).
Further information can be found with web searches involving the keywords "CUDA Runtime Driver API", for example at https://devtalk.nvidia.com/default/topic/522598/what-is-the-difference-between-runtime-and-driver-api-/