I am writing an OpenCL kernel that involves a few barriers in a loop. I have tested the kernel on a CPU (8-core FX-8150) and the results show that these barriers slowed it down by a factor of 50-100 (I further verified this by re-implementing the kernel in Java using multi-threading + CyclicBarrier). I suspect the reason is that a barrier essentially stops the CPU from taking advantage of out-of-order execution, so I am a little worried whether I would observe the same magnitude of slowdown on a GPU. I checked a few official documents and googled around a bit, but there is little information available on this topic.
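For reference, the pattern in question looks roughly like the following reduction-style kernel. This is only a minimal sketch, not my actual kernel; the name reduce_with_barriers, the arguments, and the loop body are placeholders:

    // Minimal sketch: work-group reduction with a barrier inside the loop.
    // Hypothetical reconstruction; data layout and logic are placeholders.
    __kernel void reduce_with_barriers(__global float *data,
                                       __local  float *scratch)
    {
        const int lid = get_local_id(0);
        const int lsz = get_local_size(0);

        scratch[lid] = data[get_global_id(0)];
        barrier(CLK_LOCAL_MEM_FENCE);       // all work-items see the loaded tile

        // Tree reduction: every pass must be separated by a barrier.
        for (int stride = lsz / 2; stride > 0; stride /= 2) {
            if (lid < stride)
                scratch[lid] += scratch[lid + stride];
            barrier(CLK_LOCAL_MEM_FENCE);
        }

        if (lid == 0)
            data[get_group_id(0)] = scratch[0];
    }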
Current state-of-the-art GPUs are in-order pipelined processors. GPUs fill the pipeline effectively by interleaving instructions from different warps (wavefronts). In comparison, CPUs use out-of-order speculative execution to fill the pipeline. There are different functional units, such as ALUs and SFUs, with separate pipelines. Note, however, that an instruction dependency stalls the warp. For more information on how instruction dependencies are resolved on GPUs, refer to this NVIDIA patent.
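If you want to measure the barrier cost on your GPU directly, one approach is to time two otherwise identical kernels, one with a barrier in the loop and one without (e.g. via clGetEventProfilingInfo on the host). This is a hypothetical sketch; the kernel names and the arithmetic body are placeholders:

    // Hypothetical microbenchmark: identical dependent arithmetic,
    // with and without a work-group barrier on every iteration.
    __kernel void loop_no_barrier(__global float *out, int iters)
    {
        float acc = out[get_global_id(0)];
        for (int i = 0; i < iters; ++i)
            acc = acc * 1.0001f + 0.5f;    // dependent arithmetic chain
        out[get_global_id(0)] = acc;
    }

    __kernel void loop_with_barrier(__global float *out, int iters)
    {
        float acc = out[get_global_id(0)];
        for (int i = 0; i < iters; ++i) {
            acc = acc * 1.0001f + 0.5f;
            barrier(CLK_LOCAL_MEM_FENCE); // adds one barrier per iteration
        }
        out[get_global_id(0)] = acc;
    }

The difference between the two timings isolates the per-iteration barrier overhead; if the warp-interleaving argument above holds, the gap should be much smaller than the 50-100x slowdown observed on the CPU.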
NVIDIA's whitepaper "NVIDIA's Next Generation CUDA Compute and Graphics Architecture, Code-Named 'Fermi'" states (on page 5) that the NVIDIA GigaThread Engine has capabilities including concurrent kernel execution and out-of-order thread block execution.
AMD's Evergreen architecture has SIMD capabilities and can outperform some Fermi parts, but I don't know about its out-of-order execution behavior. The HD 7000 series also has an advantage over the GTX 600 series in "local atomic add" operations (nearly 10x faster).
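For readers unfamiliar with the term, "local atomic add" means atomic_add applied to __local memory. Below is a minimal sketch of the typical use case, a per-work-group histogram; the kernel name and the bin count of 256 are assumptions for illustration:

    // Hypothetical sketch: per-work-group histogram built with
    // atomic_add on __local memory (the operation discussed above).
    #define BINS 256

    __kernel void local_histogram(__global const uchar *input,
                                  __global uint *histogram)
    {
        __local uint bins[BINS];
        const int lid = get_local_id(0);

        // Zero the local bins cooperatively.
        for (int i = lid; i < BINS; i += get_local_size(0))
            bins[i] = 0;
        barrier(CLK_LOCAL_MEM_FENCE);

        // The "local atomic add": one atomic increment per work-item.
        atomic_add(&bins[input[get_global_id(0)]], 1u);
        barrier(CLK_LOCAL_MEM_FENCE);

        // Merge the group's partial histogram into global memory.
        for (int i = lid; i < BINS; i += get_local_size(0))
            atomic_add(&histogram[i], bins[i]);
    }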