How to use L2 Cache in CUDA

Tags:

cuda

nvidia

I have searched other threads on the usage of the L2 cache in CUDA, but have been unable to find a solution. How do I make use of the L2 cache? Is there an invoking function or declaration for it? For example, to use shared memory we declare a variable with __device__ __shared__. Is there anything like that for the L2 cache?

Fr34K asked Dec 13 '25 12:12

1 Answer

The L2 cache is transparent to device code. All accesses to memory (global, local, surface, texture, constant, and instruction) that do not hit in L1 go to L2. All writes go through L2.
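Because L2 is transparent, there is nothing to declare in device code; global loads and stores simply pass through it. As a minimal illustrative sketch (the kernel and its names are mine, not from the answer): on devices of compute capability 3.5 and later, the __ldcg() intrinsic can request that a particular load bypass L1 and be cached only in L2:

```cuda
// Illustrative kernel: __ldcg() issues a "cache global" load that is
// serviced by L2 but not cached in L1. No special declaration is needed
// for L2 itself; it handles all global-memory traffic automatically.
__global__ void scale(const float* __restrict__ in, float* out,
                      float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = __ldcg(&in[i]);  // load via L2 only (bypasses L1)
        out[i] = v * s;            // the store also goes through L2
    }
}
```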

CUDA C Programming Guide F.4.2 : Global Memory

This section provides a few more details on L2.

The ptxas flag -dlcm=cg (forwarded through nvcc as -Xptxas -dlcm=cg) can be used to make global accesses uncached in L1 and cached in L2.
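For example (an illustrative build; the file and kernel names are assumptions), the flag is passed to the PTX assembler via nvcc, after which all global loads in ordinary kernels skip L1 and are cached only in L2:

```cuda
// Illustrative build command (shown as a comment):
//
//   nvcc -Xptxas -dlcm=cg kernel.cu -o kernel
//
// With -dlcm=cg in effect, a plain kernel like this one has all of its
// global loads serviced by L2 only; no per-access intrinsic is needed.
__global__ void copy(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];
}
```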

CUDA C Programming Guide B.5 : Memory Fence Functions

The function __threadfence() can be used to make sure that all writes to global memory are visible in L2.

The function __threadfence_system() can be used to make sure that all writes to global memory are visible to host threads.
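A hedged sketch of the usual producer pattern (the result/ready names are illustrative, not from the answer): __threadfence() orders the data write before the flag write, so any thread on the device that observes the flag also observes the data:

```cuda
// Illustrative producer: __threadfence() guarantees that the write to
// `result` is visible device-wide (i.e., in L2) before the write to
// `ready`, so a consumer that reads ready == 1 will see the new result.
__device__ float result;
__device__ volatile int ready;

__global__ void producer(float value)
{
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        result = value;     // 1. publish the data
        __threadfence();    // 2. order the data write before the flag
        ready = 1;          // 3. signal that `result` is valid
    }
}
```

For visibility to host threads rather than device threads, __threadfence_system() would be used in the same position.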

Greg Smith answered Dec 15 '25 06:12
