
L2 cache in Kepler

How does the L2 cache work in GPUs with the Kepler architecture in terms of locality of reference? For example, if a thread accesses an address in global memory and the value at that address is not in the L2 cache, how is the value cached? Is only that value cached (temporal locality), or are nearby values brought into the L2 cache too (spatial locality)?

The picture below is from the NVIDIA whitepaper.

Picture is from NVIDIA whitepaper

Farzad asked Oct 28 '13 05:10



1 Answer

A unified L2 cache was introduced with compute capability 2.0 and continues to be supported on the Kepler architecture. The caching policy used is LRU (least recently used); its main purpose is to avoid the global memory bandwidth bottleneck. A GPU application can exhibit both types of locality (temporal and spatial).
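As a rough illustration of how an LRU policy with line-sized fills serves both kinds of locality, here is a toy C++ model. The class name `ToyLruCache`, the fully associative layout, and the line count are assumptions made for this sketch; it is not a description of the actual L2 hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// Toy LRU cache of fixed-size lines (hypothetical teaching model).
// Assumes num_lines >= 1 and line_bytes >= 1.
class ToyLruCache {
public:
    ToyLruCache(std::size_t num_lines, std::size_t line_bytes)
        : capacity_(num_lines), line_bytes_(line_bytes) {}

    // Returns true on a hit. On a miss the whole line containing addr is
    // brought in, evicting the least recently used line if the cache is full.
    bool access(std::uint64_t addr) {
        const std::uint64_t tag = addr / line_bytes_;
        auto it = index_.find(tag);
        if (it != index_.end()) {
            // Move the line to the front: it is now the most recently used.
            order_.splice(order_.begin(), order_, it->second);
            return true;
        }
        if (order_.size() == capacity_) {
            index_.erase(order_.back());  // drop the least recently used line
            order_.pop_back();
        }
        order_.push_front(tag);
        index_[tag] = order_.begin();
        return false;
    }

private:
    std::size_t capacity_;
    std::size_t line_bytes_;
    std::list<std::uint64_t> order_;  // front = most recently used
    std::unordered_map<std::uint64_t, std::list<std::uint64_t>::iterator> index_;
};
```

In this model, a second access to the same address hits because the line is still resident (temporal locality), and an access to a neighbouring address in the same line hits because the whole line was loaded on the first miss (spatial locality).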

Whenever a thread attempts to read a memory address, the hardware looks in the L1 cache and then in the L2 cache; if the data is not found in either, a 128-byte cache line is loaded from global memory. This is the default mode. The diagram below helps show why a 128-byte access pattern gives good results.

enter image description here
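To make the 128-byte point concrete, the following C++ sketch counts how many 128-byte lines one group of threads touches for a given access stride. The 128-byte line size comes from the answer above; the 32-thread warp, the function name `lines_touched`, and its parameters are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

constexpr std::size_t kLineBytes = 128;  // line size quoted in the answer
constexpr std::size_t kWarpSize = 32;    // assumed number of threads per warp

// Counts the distinct 128-byte lines touched when thread t of a warp reads
// one element of elem_bytes at byte address
// base_addr + t * stride_elems * elem_bytes.
std::size_t lines_touched(std::uint64_t base_addr, std::size_t elem_bytes,
                          std::size_t stride_elems) {
    std::set<std::uint64_t> lines;
    for (std::size_t t = 0; t < kWarpSize; ++t) {
        const std::uint64_t first = base_addr + t * stride_elems * elem_bytes;
        const std::uint64_t last = first + elem_bytes - 1;
        // An element may straddle a line boundary, so record every line it covers.
        for (std::uint64_t line = first / kLineBytes; line <= last / kLineBytes; ++line) {
            lines.insert(line);
        }
    }
    return lines.size();
}
```

With 4-byte elements, a contiguous and aligned access (`lines_touched(0, 4, 1)`) covers exactly one line, while a stride of 32 elements spreads the same 32 reads over 32 different lines, so far more data is fetched than is actually used.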

Sagar Masuti answered Oct 13 '22 20:10
