As I understand it, GPUs switch between warps to hide memory latency. But I wonder under which conditions a warp will be switched out. For example, if a warp performs a load and the data is already in the cache, is the warp switched out, or does it continue with the next computation? What happens if there are two consecutive adds? Thanks
A GPU context is described here. It represents all the state (data, variables, conditions, etc.) that is collectively required and instantiated to perform certain tasks (e.g. CUDA compute, graphics, H.264 encode, etc.).
Context switching involves storing the context or state of a process so that it can be reloaded when required and execution can resume from the same point as before. This is a feature of a multitasking operating system that allows a single CPU to be shared by multiple processes.
Example of context switching: one process is running on the CPU, executing its task. While it is running, another process with a higher priority arrives in the ready queue and needs the CPU to complete its own task.
A context switch captures the CPU state (the context) of the currently running thread and pauses it, then swaps in the state of another thread so that it can resume running where it previously left off.
First of all, once a thread block is launched on a multiprocessor (SM), all of its warps are resident until they all exit the kernel. Thus a block is not launched until there are sufficient registers for all warps of the block, and until there is enough free shared memory for the block.
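If you want to see how register and shared-memory usage constrains residency in practice, the CUDA runtime exposes an occupancy query. Below is a minimal sketch (the kernel `myKernel` is just a placeholder) that prints a kernel's per-thread register count, its static shared memory, and how many blocks of a given size can be resident on one SM at a time. If that count is 0 for your block size, the block simply cannot fit on an SM.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel; its register/shared-memory footprint is what the queries report on.
__global__ void myKernel(float *out, const float *in)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = in[i] * 2.0f;
}

int main()
{
    const int blockSize = 256;

    cudaFuncAttributes attr;
    cudaFuncGetAttributes(&attr, myKernel);   // per-thread registers, static shared memory, ...

    int blocksPerSM = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, myKernel,
                                                  blockSize, 0 /* dynamic shared mem */);

    printf("registers/thread: %d, static shared mem: %zu bytes, "
           "resident blocks per SM at blockDim=%d: %d\n",
           attr.numRegs, attr.sharedSizeBytes, blockSize, blocksPerSM);
    return 0;
}
```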
So warps are never "switched out" -- there is no inter-warp context switching in the traditional sense of the word, where a context switch requires saving registers to memory and restoring them.
The SM does, however, choose which instructions to issue from among all resident warps. In fact, the SM is more likely to issue two consecutive instructions from different warps than from the same warp, regardless of instruction type or how much ILP (instruction-level parallelism) is available. Not doing so would expose the SM to dependency stalls. Even "fast" instructions like adds have non-zero latency, because the arithmetic pipeline is multiple cycles long. On Fermi, for example, the hardware can issue 2 or more warp-instructions per cycle (peak), and the arithmetic pipeline latency is ~12 cycles. Therefore you need multiple warps in flight just to hide arithmetic latency, not only memory latency.
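To see the arithmetic-latency point for yourself, here is a rough sketch (not a rigorous microbenchmark; exact numbers vary by architecture and clock behavior) that runs a long chain of dependent adds in a single block, first with one warp and then with 32 warps. Because a single block runs on a single SM, the elapsed times tend to come out roughly comparable: the extra warps' instructions fill the issue slots that one warp leaves idle while waiting on its own dependent results.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread executes a long chain of *dependent* adds, so one warp on its own
// cannot keep the arithmetic pipeline busy: every add must wait for the previous
// result. Extra warps give the scheduler independent instructions to issue in
// those otherwise idle cycles.
__global__ void dependentAdds(float *out, int iters, float inc)
{
    float x = (float)threadIdx.x;
    for (int i = 0; i < iters; ++i)
        x = x + inc;                      // serial dependency chain
    out[threadIdx.x] = x;
}

int main()
{
    const int iters = 1 << 20;
    const int sizes[] = {32, 1024};       // 1 warp vs. 32 warps; one block -> one SM

    float *d_out;
    cudaMalloc(&d_out, 1024 * sizeof(float));

    for (int blockSize : sizes) {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        dependentAdds<<<1, blockSize>>>(d_out, iters, 1.0f);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("%2d warp(s): %.3f ms\n", blockSize / 32, ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
    }

    cudaFree(d_out);
    return 0;
}
```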
In general, the details of warp scheduling are architecture dependent, not publicly documented, and pretty much guaranteed to change over time. The CUDA programming model is independent of the scheduling algorithm, and you should not rely on it in your software.