
physical memory on AMD devices: local vs private

I'm writing an algorithm in OpenCL in which every work-item needs to keep a fair amount of data, somewhere between a long[70] and a long[200] per work-item.

Recent AMD devices have 32 KiB of __local memory, which (for the given amount of data per work-item) is enough to store the data for 20-58 work-items. However, from what I understand of the architecture (and especially from this drawing), each shader core also has a dedicated amount of private memory. I have not been able to find its size.
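The 20-58 range above follows from simple division. Here is a minimal sketch of that arithmetic in plain C, assuming an OpenCL long is 64 bits (8 bytes) and that all 32 KiB is available for this data; the function names are made up for illustration and are not part of any OpenCL API.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed by one work-item that stores `count` 64-bit values.
   Assumption: an OpenCL `long` is 64 bits, modeled here as int64_t. */
size_t bytes_per_work_item(size_t count)
{
    return count * sizeof(int64_t);
}

/* Number of work-items whose data fits entirely in `local_bytes`
   of __local memory (hypothetical helper, integer division rounds down). */
size_t work_items_that_fit(size_t local_bytes, size_t count)
{
    return local_bytes / bytes_per_work_item(count);
}
```

With 32 * 1024 bytes of local memory, work_items_that_fit gives 58 for long[70] (560 bytes each) and 20 for long[200] (1600 bytes each).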

Can anyone tell me how to find out how much private memory each work-item has?

I'm particularly curious about the HD7970, since I plan to buy some of these soon.

Edit: Problem solved, the answer is in Appendix D of the AMD APP OpenCL Programming Guide.

asked Feb 17 '12 16:02 by user1111929



1 Answer

The answer was given by user talonmies in the comments, so I'll write it in a new answer here to close the question.

These values can be found in Appendix D of the AMD APP OpenCL Programming Guide, http://developer.amd.com/sdks/amdappsdk/assets/amd_accelerated_parallel_processing_opencl_programming_guide.pdf (a similar document exists for NVIDIA). Apparently a register is 128 bits (4x32) on AMD devices and all modern high-end devices have 16384 registers, so that's 16384 x 16 bytes, a remarkable 256 KiB per compute unit.
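As a sanity check on that figure, here is a small sketch in plain C that multiplies the register count by the register width. The 16384 and 128-bit values are the ones quoted above from Appendix D; the function name is hypothetical and the calculation assumes every register has the same width.

```c
#include <stddef.h>

/* Total register file size in bytes for `register_count` registers
   of `bits_per_register` bits each (hypothetical helper). */
size_t register_file_bytes(size_t register_count, size_t bits_per_register)
{
    return register_count * (bits_per_register / 8);
}
```

Calling register_file_bytes(16384, 128) yields 262144 bytes, i.e. 256 KiB, which matches the number above.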

answered Nov 15 '22 16:11 by user1111929
