Historically and more properly, it is more than just one device/function—it is that combination of hardware or servers, software and management activities used to control communications between internal networks and external networks.
In CUDA, the host refers to the CPU and its memory, while the device refers to the GPU and its memory. Code run on the host can manage memory on both the host and device, and also launches kernels which are functions executed on the device. These kernels are executed by many GPU threads in parallel.
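To make the host/device split concrete, here is a minimal sketch of host code that allocates device memory, copies data over, launches a kernel, and copies the result back. The names `addOne` and `runAddOne` are made up for illustration, and the block size of 256 is an arbitrary choice, not a recommendation.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: executed on the device by many threads, one element per thread.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;   // guard: the last block may have spare threads
}

// Hypothetical host helper: returns true if every element came back incremented.
bool runAddOne(int n)
{
    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) host[i] = i;

    // The host manages device memory through the runtime API.
    int *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) return false;
    cudaMemcpy(dev, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // Launch enough blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    addOne<<<blocks, threads>>>(dev, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaError_t err = cudaMemcpy(host.data(), dev, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (host[i] != i + 1) return false;
    return true;
}

int main()
{
    bool ok = runAddOne(1000);
    std::printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

This assumes a CUDA-capable GPU is present and the file is compiled with nvcc; error handling is kept to a minimum to keep the sketch short.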
Global functions are also called "kernels". They are the functions that you can call from the host side using CUDA kernel launch syntax (<<<...>>>).
Device functions can only be called from other device or global functions. __device__ functions cannot be called from host code.
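As a rough illustration of that calling rule, the sketch below has a kernel call a `__device__` helper like an ordinary function, while the host only ever launches the kernel. The names `square`, `squareAll`, and `runSquareAll` are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device function: callable only from device code, with a plain function call.
__device__ float square(float x)
{
    return x * x;
}

// Kernel: launched from the host, calls the device function per element.
__global__ void squareAll(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = square(data[i]);
}

// Hypothetical host helper: squares 0..7 on the GPU and checks the results.
bool runSquareAll()
{
    const int n = 8;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    squareAll<<<1, n>>>(dev, n);   // one block of n threads
    // Calling square(2.0f) directly here in host code would not compile.

    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (host[i] != static_cast<float>(i * i)) return false;
    return true;
}

int main()
{
    bool ok = runSquareAll();
    std::printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```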
Differences between __device__ and __global__ functions are:
- __device__ functions can be called only from the device, and are executed only on the device.
- __global__ functions can be called from the host, and are executed on the device.
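Therefore, you call __device__ functions from kernel functions like ordinary functions, without any launch configuration. You can also have a host version and a device version of a function, e.g. void foo(void) and __device__ void foo(void): the first is executed on the host and can only be called from a host function, and the second is executed on the device and can only be called from a device or kernel function. Note that compilers differ on whether two separate definitions with the same signature are accepted; marking a single definition __host__ __device__ compiles it for both sides.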
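One common way to get a function that is usable on both sides, without keeping two separate versions, is to mark a single definition `__host__ __device__`. The sketch below assumes that approach; `triple`, `tripleOnDevice`, and `tripleViaGpu` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Compiled for both the host and the device from one definition.
__host__ __device__ int triple(int x)
{
    return 3 * x;
}

// Kernel that calls the device-side copy of triple().
__global__ void tripleOnDevice(int *out, int x)
{
    *out = triple(x);
}

// Hypothetical host helper: computes triple(x) on the GPU, or -1 on failure.
int tripleViaGpu(int x)
{
    int *dev = nullptr;
    if (cudaMalloc(&dev, sizeof(int)) != cudaSuccess) return -1;

    tripleOnDevice<<<1, 1>>>(dev, x);   // a single thread is enough here

    int result = -1;
    cudaError_t err = cudaMemcpy(&result, dev, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return err == cudaSuccess ? result : -1;
}

int main()
{
    int onHost = triple(7);          // host-side copy, ordinary call
    int onDevice = tripleViaGpu(7);  // device-side copy, via a kernel
    std::printf("host=%d device=%d\n", onHost, onDevice);
    return onHost == onDevice ? 0 : 1;
}
```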
You can also visit the following link, which was useful for me: http://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions
- __global__ - Runs on the GPU, called from the CPU or the GPU*. Launched with <<<...>>> arguments (for example dim3 grid and block sizes).
- __device__ - Runs on the GPU, called from the GPU. Can be used with variables too.
- __host__ - Runs on the CPU, called from the CPU.
*) __global__ functions can be called from other __global__ functions starting with compute capability 3.5.
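Since `__device__` also applies to variables, here is a small sketch of a variable that lives in GPU memory, is updated by a kernel, and is read back by the host through the runtime's symbol-copy calls. The names `hitCount`, `countThreads`, and `launchAndCount` are invented for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A variable that resides in device (GPU) global memory.
__device__ int hitCount = 0;

// Every launched thread increments the counter once.
__global__ void countThreads()
{
    atomicAdd(&hitCount, 1);   // atomic, since many threads update one location
}

// Hypothetical host helper: returns the counter after the launch, or -1 on failure.
int launchAndCount(int blocks, int threads)
{
    // The host cannot assign to hitCount directly; it copies to the symbol.
    int zero = 0;
    if (cudaMemcpyToSymbol(hitCount, &zero, sizeof(int)) != cudaSuccess)
        return -1;

    countThreads<<<blocks, threads>>>();

    // Reading the value back also waits for the kernel to finish.
    int result = -1;
    if (cudaMemcpyFromSymbol(&result, hitCount, sizeof(int)) != cudaSuccess)
        return -1;
    return result;
}

int main()
{
    int counted = launchAndCount(4, 64);
    std::printf("threads counted: %d\n", counted);
    return counted == 4 * 64 ? 0 : 1;
}
```

With a working CUDA device, the counter should equal the total number of launched threads (blocks times threads per block).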
I will explain it with an example:
int main()
{
// Your main function. Executed by the CPU.
}
__global__ void calledFromCpuForGPU(...)
{
// This function is called by the CPU and is supposed to be executed on the GPU.
}
__device__ void calledFromGPUforGPU(...)
{
// This function is called by the GPU and is supposed to be executed on the GPU.
}
i.e. when we want a host (CPU) function to call a function that runs on the device (GPU), we mark it __global__. Read this: "https://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialGlobalFunctions"
And when we want a function running on the device (a kernel or another device function) to call a helper function on the GPU, we mark that helper __device__. Read this: "https://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions"
This should be enough to understand the difference.