How does the L2 cache work in GPUs with the Kepler architecture in terms of locality of reference? For example, if a thread accesses an address in global memory and the value at that address is not in the L2 cache, how is the value cached? Is the caching only temporal, or are values at nearby addresses also brought into the L2 cache (spatial)?
The picture below is from an NVIDIA whitepaper.
A unified L2 cache was introduced with compute capability 2.0 and is still present on the Kepler architecture. Its replacement policy is LRU (least recently used), and its main purpose is to reduce pressure on global memory bandwidth. A GPU application can exploit both types of locality (temporal and spatial).
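As a rough illustration of the temporal side (a hypothetical sketch, not taken from the whitepaper; the kernel and the names apply_lut and lut are made up for this example): many threads index into the same small lookup table in global memory. After the first warps miss and the table's lines are loaded into L2, later reads from any SM tend to hit those lines because the LRU policy keeps recently used data resident.

    __global__ void apply_lut(const float *in, float *out, const float *lut, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            // Assumed 256-entry table: 256 * 4 bytes = 1 KB, far smaller than
            // Kepler's L2 (up to 1.5 MB), so repeated reads from many warps keep
            // hitting the same cached lines (temporal locality).
            int idx = (int)in[i] & 255;
            out[i] = lut[idx];
        }
    }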
Whenever a thread attempts to read a specific memory location, the hardware first looks in the L1 and L2 caches; if the data is not found there, it loads an entire 128-byte cache line from global memory. This is the default mode. Because a whole line is fetched, values at neighbouring addresses are cached along with the requested one (spatial locality), while the LRU policy keeps recently used lines resident (temporal locality). The diagram below shows why a 128-byte access pattern gives the best result.
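A minimal sketch of the spatial-locality side (hypothetical kernels, not from the whitepaper): when the 32 threads of a warp read consecutive 4-byte floats, the whole warp is served by a single 128-byte cache line, whereas a 128-byte stride per thread puts every thread of the warp onto a different line.

    // Coalesced: warp k touches in[32k] .. in[32k+31], i.e. one 128-byte line.
    __global__ void coalesced_copy(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i];
    }

    // Strided: each thread of a warp lands in a different 128-byte line, so one
    // warp-level read pulls 32 lines instead of 1
    // (assumes 'in' holds at least 32 * n floats).
    __global__ void strided_copy(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i * 32];
    }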