
How are 2D / 3D CUDA blocks divided into warps?

If I start my kernel with a grid whose blocks have dimensions:

dim3 block_dims(16,16);

How are the grid blocks now split into warps? Do the first two rows of such a block form one warp, or the first two columns, or is this arbitrarily-ordered?

Assume a GPU Compute Capability of 2.0.

Gabriel asked May 30 '11 13:05



2 Answers

Threads are numbered within a block so that threadIdx.x varies fastest, threadIdx.y second fastest, and threadIdx.z slowest. This is functionally the same as column-major ordering in multidimensional arrays, where the first index varies fastest. Warps are constructed sequentially from threads in this ordering, so the calculation for a 2D block is:

unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
unsigned int warpid = tid / warpSize;

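For the 16x16 block in the question, with a warp size of 32, each warp therefore consists of two consecutive rows of threads: threadIdx.y = 0 and 1 form warp 0, threadIdx.y = 2 and 3 form warp 1, and so on up to warp 7. This is covered in both the CUDA programming guide and the PTX guide.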
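The same arithmetic extends to 3D blocks by adding a threadIdx.z term. The following is a minimal host-side sketch of that mapping in plain C++, so it can be checked without a GPU. The Dim3 struct and the function names linear_thread_id, warp_id and lane_id are hypothetical stand-ins for illustration, not part of the CUDA API, and the default warp size of 32 is an assumption that matches current NVIDIA hardware.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for CUDA's dim3 / threadIdx / blockDim.
struct Dim3 {
    unsigned int x, y, z;
};

// Linear thread index within a block: x varies fastest, then y, then z.
unsigned int linear_thread_id(Dim3 thread_idx, Dim3 block_dim) {
    assert(thread_idx.x < block_dim.x);
    assert(thread_idx.y < block_dim.y);
    assert(thread_idx.z < block_dim.z);
    return thread_idx.x
         + thread_idx.y * block_dim.x
         + thread_idx.z * block_dim.x * block_dim.y;
}

// Index of the warp (within the block) that holds the given thread.
unsigned int warp_id(Dim3 thread_idx, Dim3 block_dim,
                     unsigned int warp_size = 32) {
    assert(warp_size > 0);
    return linear_thread_id(thread_idx, block_dim) / warp_size;
}

// Position of the thread inside its warp (0 .. warp_size - 1).
unsigned int lane_id(Dim3 thread_idx, Dim3 block_dim,
                     unsigned int warp_size = 32) {
    assert(warp_size > 0);
    return linear_thread_id(thread_idx, block_dim) % warp_size;
}
```

With a block of Dim3{16, 16, 1}, warp_id returns 0 for every thread whose y is 0 or 1, and 1 for every thread whose y is 2 or 3, which matches the two-rows-per-warp layout the question asks about.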

talonmies answered Sep 16 '22 15:09


To illustrate @talonmies's answer, here is the Visual Studio 'WarpWatch' window for two consecutive warps (dim3 block_dims(16,16); and warpSize = 32):

[Two screenshots: First Warp, Second Warp]
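The contents of two consecutive warps can also be listed without a debugger. The sketch below is a host-side C++ approximation, not a reproduction of the debugger output. The helper name threads_in_warp is hypothetical, and it assumes a 2D block and a warp size of 32.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: (threadIdx.x, threadIdx.y) of every thread in one warp
// of a 2D block, listed in lane order. A trailing partial warp is returned
// with fewer than warp_size entries.
std::vector<std::pair<unsigned int, unsigned int>>
threads_in_warp(unsigned int warp, unsigned int block_x, unsigned int block_y,
                unsigned int warp_size = 32) {
    assert(block_x > 0 && warp_size > 0);
    std::vector<std::pair<unsigned int, unsigned int>> coords;
    const unsigned int total = block_x * block_y;   // threads in the block
    const unsigned int first = warp * warp_size;    // linear id of lane 0
    for (unsigned int tid = first; tid < first + warp_size && tid < total; ++tid) {
        // Invert tid = x + y * block_x to recover the 2D coordinates.
        coords.emplace_back(tid % block_x, tid / block_x);
    }
    return coords;
}
```

For a 16x16 block, threads_in_warp(0, 16, 16) runs from (0,0) to (15,1) and threads_in_warp(1, 16, 16) runs from (0,2) to (15,3), so each warp covers two full rows of the block.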

Mohsen answered Sep 17 '22 15:09