CUDA constant memory banks

When we check register usage by passing -Xptxas -v to nvcc, we see something like this:

ptxas info : Used 63 registers, 244 bytes cmem[0], 51220 bytes cmem[2], 24 bytes cmem[14], 20 bytes cmem[16]

I wonder whether there is currently any documentation that clearly explains cmem[x]. What is the point of separating constant memory into multiple banks, how many banks are there in total, and what are the banks other than 0, 2, 14, and 16 used for?

As a side note, @njuffa (special thanks to you) previously explained on NVIDIA's forum what some of these banks are used for:

Used constant memory is partitioned in constant program ‘variables’ (bank 1), plus compiler generated constants (bank 14).

cmem[0]: kernel arguments

cmem[2]: user-defined constant objects

cmem[16]: compiler-generated constants (some of which may correspond to literal constants in the source code)
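To compare bank usage across kernels or GPU generations, the ptxas line can be split into per-bank byte counts. This is a minimal sketch that assumes the "N bytes cmem[K]" format shown above; the function name `parse_cmem_usage` is hypothetical and not part of any NVIDIA tool.

```python
import re


def parse_cmem_usage(ptxas_line):
    """Map constant bank number -> bytes used, from one 'ptxas info' line.

    Assumes each bank is reported as '<bytes> bytes cmem[<bank>]'.
    """
    pairs = re.findall(r"(\d+) bytes cmem\[(\d+)\]", ptxas_line)
    return {int(bank): int(size) for size, bank in pairs}


# The line quoted in the question, split only for readability.
line = ("ptxas info : Used 63 registers, 244 bytes cmem[0], "
        "51220 bytes cmem[2], 24 bytes cmem[14], 20 bytes cmem[16]")

usage = parse_cmem_usage(line)
print(usage)  # {0: 244, 2: 51220, 14: 24, 16: 20}
```

The sketch only reports which banks appear and how large they are; what each bank holds still depends on the GPU generation.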

biubiuty asked Sep 05 '12 22:09



1 Answer

To my knowledge, the usage of GPU constant banks by CUDA is not officially documented. The number and usage of constant banks differ between GPU generations. These are low-level implementation details that programmers do not have to worry about.

The usage of constant banks can be reverse engineered, if so desired, by looking at the machine code (SASS) generated for a given platform. In fact, this is how I came up with the information cited in the original question (this information came from an NVIDIA developer forum post of mine). As I recall, the information I gave there was based on ad hoc reverse engineering specifically applied to Fermi-class devices, but I am unable to verify this at this time as the forums are currently inaccessible.
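A minimal sketch of that kind of inspection, assuming SASS text such as that printed by cuobjdump -sass, where constant operands are written as c[bank][offset] with a hexadecimal bank number. The helper name `count_cbank_refs` is hypothetical, and the sample instructions are made up for illustration rather than taken from real compiler output.

```python
import re
from collections import Counter

# A constant operand in SASS text looks like c[0x2][0x10]: bank first, then offset.
# Only the bank field is captured, so register-indexed offsets also match.
_CBANK_RE = re.compile(r"\bc\[(0x[0-9a-fA-F]+)\]\[")


def count_cbank_refs(sass_text):
    """Return a Counter mapping constant bank number -> number of references."""
    return Counter(int(m.group(1), 16) for m in _CBANK_RE.finditer(sass_text))


# Illustrative lines only, not real compiler output.
sample = """
MOV R1, c[0x1][0x100];
MOV R0, c[0x0][0x20];
FMUL R2, R0, c[0x2][0x0];
FADD R2, R2, c[0x10][0x4];
"""

print(sorted(count_cbank_refs(sample).items()))
# [(0, 1), (1, 1), (2, 1), (16, 1)]
```

Tallying the banks tells you which ones a kernel touches; working out what each bank holds still means reading the surrounding instructions.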

One reason for having multiple constant banks is to reserve the user-visible constant memory for the use of CUDA programmers, while storing additional read-only information provided by hardware or tools in additional constant banks.

Note that the CUDA math library is provided as source files and its functions get inlined into user code. The constant memory usage of CUDA math library functions is therefore included in the statistics for the user-visible constant memory.

njuffa answered Oct 22 '22 01:10
