Maximum blocks per grid: CUDA

What is the maximum number of blocks in a grid that can be created per kernel launch? I am slightly confused here since

Now the compute capability table here says that there can be 65535 blocks per grid dimension in CUDA compute capability 2.0.

Does that mean the total number of blocks = 65535*65535?

Or does it mean that you can arrange at most 65535 blocks in total, either as a 1D grid of 65535 blocks or as a 2D grid of sqrt(65535) * sqrt(65535) blocks?

Thank you.

smilingbuddha asked May 18 '11 17:05

People also ask

What is the maximum number of blocks supported by CUDA?

Theoretically, you can have 65535 blocks per dimension of the grid, up to 65535 * 65535 * 65535.

What is the maximum number of simultaneous blocks that will run on a single SM?

Each SM can have up to 16 active blocks on Kepler and 8 active blocks on Fermi. You also need to think in terms of warps.

How many blocks are in a GPU?

Hardware limits the number of blocks per grid dimension in a single launch to 65,535. Hardware also limits the number of threads per block with which we can launch a kernel. – For many GPUs, maxThreadsPerBlock = 512 (or 1024, version 2.

What is CUDA block size?

Each CUDA card has a maximum number of threads per block (512 or 1024, depending on compute capability).


1 Answer

The limit is 65535 per dimension of the grid, not 65535 blocks in total. On compute 1.x cards, 1D and 2D grids are supported. On compute 2.x cards, 3D grids are also supported, so the limits for Fermi (compute 2.x) cards are 65535 blocks for a 1D grid, 65535 x 65535 for a 2D grid, and 65535 x 65535 x 65535 for a 3D grid.
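To make the arithmetic concrete, here is a minimal sketch in plain Python (host-side arithmetic only, not CUDA code). It assumes the 65535 per-dimension figure quoted above for compute 1.x/2.x cards. The names `MAX_BLOCKS_PER_DIM`, `max_total_blocks`, and `grid_2d_for` are hypothetical helpers invented for illustration and are not part of any CUDA API.

```python
# Illustrative sketch: assumes the 65535 per-dimension limit quoted in the
# answer (compute 1.x/2.x). All names here are hypothetical, not CUDA API.

MAX_BLOCKS_PER_DIM = 65535  # assumed per-dimension grid limit


def max_total_blocks(num_dims):
    """Upper bound on the block count of a grid with num_dims dimensions."""
    if num_dims not in (1, 2, 3):
        raise ValueError("a grid has 1, 2, or 3 dimensions")
    return MAX_BLOCKS_PER_DIM ** num_dims


def grid_2d_for(total_blocks):
    """Pick (grid_x, grid_y) so that grid_x * grid_y >= total_blocks and
    neither dimension exceeds the per-dimension limit."""
    if total_blocks < 1:
        raise ValueError("need at least one block")
    if total_blocks > max_total_blocks(2):
        raise ValueError("too many blocks for a 2D grid")
    grid_x = min(total_blocks, MAX_BLOCKS_PER_DIM)
    grid_y = -(-total_blocks // grid_x)  # integer ceiling division
    return grid_x, grid_y


print(max_total_blocks(1))  # 65535
print(max_total_blocks(2))  # 4294836225
print(max_total_blocks(3))  # 281462092005375
print(grid_2d_for(100000))  # (65535, 2)
```

Note that `grid_2d_for(100000)` yields 65535 x 2 = 131070 blocks, more than the 100000 requested, so a kernel launched with such a grid would need to compute a linear block index and return early for the surplus blocks.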

talonmies answered Sep 29 '22 21:09
