Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Write multiple kernels or a Single kernel

Tags:

opencl

Suppose that I've two big functions. Is it better to write them in a separate kernels and call them sequentially, or is better to write only one kernel? (I don't want to read the data back and force form between host and device in between). What about the speed up if I want to call the kernel many times?

like image 954
Damoon Avatar asked Feb 29 '12 18:02

Damoon


2 Answers

One thing to consider is the effect of register pressure on hardware utilization and performance.

As a general rule, big kernels have big register footprints. Typical OpenCL devices (ie. GPUs) have very finite register file sizes and large kernels can result in lower concurrency (fewer concurrent warps/wavefronts), less opportunities for latency hiding, and poorer overall performance. On the other hand, kernel launch overheads are pretty low on most platforms, so if your algorithm doesn't have an enormous amount of state to save between "phases" of execution, the penalty of using multiple kernels can be rather low.

Using multiple kernels also has another side benefit -- you get implicit synchronization between all work units for free. Often that can eliminate the need for atomic memory operations and synchronization primitives which can have a negative impact on code performance.

The ultimate guide should be measured performance. There is no universal rule-of-thumb for this sort of things. Benchmarking is the only way to know for sure.

like image 73
talonmies Avatar answered Jan 01 '23 21:01

talonmies


In general this is a question of (maybe) slightly better performance vs. readibility of your code. Copying buffers is no issue as long as you keep them within the same context. E.g. you could set one output buffer of a kernel to be an input buffer of the next kernel, which would not involve any copying.

like image 31
rdoubleui Avatar answered Jan 01 '23 23:01

rdoubleui