What is "memory bound kernel and compute bound kernel in GPUs"?
Is this related to performance of GPUs?
Memory-bound refers to a situation in which the time to complete a computation is determined primarily by how fast data can be moved between memory and the processor, rather than by the arithmetic itself. This is in contrast to compute-bound algorithms, where the number of elementary computation steps is the deciding factor.
A kernel is a function executed on the GPU. Every CUDA kernel starts with the `__global__` declaration specifier, and each thread derives a unique global ID from built-in variables such as `blockIdx`, `blockDim`, and `threadIdx`. Figure 2: CUDA kernels are subdivided into blocks.
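As a minimal sketch (the kernel name and launch parameters here are our own, not from the original text), a CUDA kernel and its global thread ID look like this:

```cuda
// Illustrative CUDA kernel: scales each element of an array.
__global__ void scale(float *data, float factor, int n)
{
    // Unique global thread ID built from the built-in variables.
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)          // guard: the grid may have more threads than elements
        data[gid] *= factor;
}

// Host-side launch: a grid of blocks, each with 256 threads.
// scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
```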
Informally, a kernel is memory-bound if most of its execution time is spent executing memory instructions, and compute-bound if most of its instructions are ALU/FPU operations. GPUs offer both high memory bandwidth and high compute throughput, so they can suit either category. The terms are used for categorization and to indicate which optimization techniques are likely to improve an application's performance significantly.
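The contrast can be illustrated with two toy kernels (assumed examples, not from the original text). Both move the same amount of data, but the second does far more arithmetic per element:

```cuda
// Memory-bound: one load and one store per thread, almost no arithmetic,
// so execution time is dominated by memory traffic.
__global__ void copy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Compute-bound: identical memory traffic, but many ALU/FPU operations
// per element, so execution time is dominated by arithmetic.
__global__ void heavy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        for (int k = 0; k < 1000; ++k)
            x = x * 1.0000001f + 0.5f;  // long dependent multiply-add chain
        out[i] = x;
    }
}
```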
There are different optimization tips for the workloads of each category.
For example, for memory-bound workloads:

- Ensure global memory accesses are coalesced, so that neighboring threads in a warp touch neighboring addresses.
- Stage frequently reused data in shared memory to cut global-memory traffic.
- Reduce the total amount of data moved, e.g. by using smaller data types where precision allows.
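The coalescing tip can be sketched as follows (assumed example; the kernel names are our own):

```cuda
// Uncoalesced: thread i reads with a large stride, so neighboring threads
// touch addresses far apart and each warp issues many memory transactions.
__global__ void strided(const float *in, float *out, int n, int stride)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[((size_t)i * stride) % n];
}

// Coalesced: neighboring threads read neighboring addresses, so a warp's
// loads combine into a few wide memory transactions.
__global__ void coalesced(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}
```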
For compute-bound workloads:

- Replace expensive operations with cheaper ones, e.g. fast-math intrinsics such as `__expf()` instead of `expf()` where reduced precision is acceptable.
- Minimize divergent branches within a warp so that all lanes do useful work.
- Expose enough parallelism to keep the ALUs and FPUs busy.
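A short sketch of the intrinsics tip (assumed example; `activate` is our own name):

```cuda
// Compute-bound optimization: use a fast-math intrinsic.
__global__ void activate(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // __expf is a hardware-accelerated, lower-precision alternative
        // to expf; acceptable when full accuracy is not required.
        out[i] = 1.0f / (1.0f + __expf(-in[i]));  // logistic sigmoid
    }
}
```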