What is "memory bound kernel and compute bound kernel in GPUs"?
Is this related to performance of GPUs?
Memory-bound refers to a situation in which the time to complete a computation is determined primarily by how fast data can be moved between memory and the processor, rather than by the arithmetic itself. This is in contrast to compute-bound algorithms, where the number of elementary computation steps is the deciding factor.
A kernel is a function executed on the GPU. Every CUDA kernel starts with the `__global__` declaration specifier, and each thread derives a unique global ID from built-in variables such as `blockIdx`, `blockDim`, and `threadIdx`. Figure 2: CUDA kernels are subdivided into blocks.
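As a minimal sketch (the kernel name and launch parameters here are our own, not from the original text), a CUDA kernel and its global thread ID look like this:

```cuda
// Illustrative CUDA kernel: scales each element of an array.
__global__ void scale(float *data, float factor, int n)
{
    // Unique global thread ID built from the built-in variables.
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)          // guard: the grid may have more threads than elements
        data[gid] *= factor;
}

// Host-side launch: a grid of blocks, each with 256 threads.
// scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
```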
Informally, a kernel is memory-bound if most of its execution time is spent executing memory instructions, and compute-bound if most of its instructions are ALU/FPU operations. GPUs offer both high memory bandwidth and high compute throughput, so they can suit either category. The terms are used for categorization and to indicate which optimization techniques are likely to improve an application's performance significantly.
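The contrast can be illustrated with two toy kernels (assumed examples, not from the original text). Both move the same amount of data, but the second does far more arithmetic per element:

```cuda
// Memory-bound: one load and one store per thread, almost no arithmetic,
// so execution time is dominated by memory traffic.
__global__ void copy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Compute-bound: identical memory traffic, but many ALU/FPU operations
// per element, so execution time is dominated by arithmetic.
__global__ void heavy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        for (int k = 0; k < 1000; ++k)
            x = x * 1.0000001f + 0.5f;  // long dependent multiply-add chain
        out[i] = x;
    }
}
```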
There are different optimization tips for the workloads of each category.
For example, for memory-bound workloads:

- Ensure global memory accesses are coalesced, so that neighboring threads in a warp touch neighboring addresses.
- Stage frequently reused data in shared memory to cut global-memory traffic.
- Reduce the total amount of data moved, e.g. by using smaller data types where precision allows.
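The coalescing tip can be sketched as follows (assumed example; the kernel names are our own):

```cuda
// Uncoalesced: thread i reads with a large stride, so neighboring threads
// touch addresses far apart and each warp issues many memory transactions.
__global__ void strided(const float *in, float *out, int n, int stride)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[((size_t)i * stride) % n];
}

// Coalesced: neighboring threads read neighboring addresses, so a warp's
// loads combine into a few wide memory transactions.
__global__ void coalesced(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}
```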
For compute-bound workloads:

- Replace expensive operations with cheaper ones, e.g. fast-math intrinsics such as `__expf()` instead of `expf()` where reduced precision is acceptable.
- Minimize divergent branches within a warp so that all lanes do useful work.
- Expose enough parallelism to keep the ALUs and FPUs busy.
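A short sketch of the intrinsics tip (assumed example; `activate` is our own name):

```cuda
// Compute-bound optimization: use a fast-math intrinsic.
__global__ void activate(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // __expf is a hardware-accelerated, lower-precision alternative
        // to expf; acceptable when full accuracy is not required.
        out[i] = 1.0f / (1.0f + __expf(-in[i]));  // logistic sigmoid
    }
}
```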