Prefetch in CUDA (through C code)

2 Answers

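According to the PTX manual, prefetching is exposed through two instructions: prefetch{.space}.level [a], where .space is .global or .local and .level is .L1 or .L2, and prefetchu.L1 [a], which prefetches into the uniform cache. Both bring the cache line containing the address a into the named cache level.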

You can embed the PTX instructions into the CUDA kernel. Here is a tiny sample from NVIDIA's documentation:

__device__ int cube (int x)
{
  int y;
  asm("{\n\t"                       // use braces for local scope
      " .reg .u32 t1;\n\t"           // temp reg t1,
      " mul.lo.u32 t1, %1, %1;\n\t" // t1 = x * x
      " mul.lo.u32 %0, t1, %1;\n\t" // y = t1 * x
      "}"
      : "=r"(y) : "r" (x));
  return y;
}

Following the same pattern, you can write a prefetch function in C:

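__device__ void prefetch_l1 (const void *addr)
{
  // prefetch writes no register, so there is no output operand.
  // "l" passes a 64-bit address; volatile keeps the compiler from dropping the asm.
  asm volatile("prefetch.global.L1 [%0];" : : "l"(addr));
}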

NOTE: prefetch requires a GPU of compute capability 2.0 or higher. Pass the matching compile flag, for example -arch=sm_20 (CUDA 9 and later no longer accept sm_20, so use the architecture of your own GPU there).
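The compute capability requirement can also be checked at run time before launching a kernel that prefetches. This is a minimal sketch, not part of the original answer: the helper name device_supports_prefetch is hypothetical, while cudaGetDeviceProperties and the major/minor fields come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns true when the given device reports
// compute capability 2.0 or higher, which the PTX prefetch instruction needs.
bool device_supports_prefetch(int device)
{
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
    return false;                       // no usable device or invalid ordinal
  std::printf("device %d: %s, compute capability %d.%d\n",
              device, prop.name, prop.major, prop.minor);
  return prop.major >= 2;
}
```

A caller would typically pass the ordinal it is about to use, for example device_supports_prefetch(0), and fall back to a kernel without the hint when the check fails.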


answered Oct 01 '22 05:10

lashgar

According to this thread, the following wrappers cover the different prefetch variants (global memory into L1, the uniform cache, and global memory into L2):

#define DEVICE_STATIC_INTRINSIC_QUALIFIERS  static __device__ __forceinline__

// 64-bit builds pass pointers in 64-bit registers ("l"); 32-bit builds use "r".
#if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__)
#define PXL_GLOBAL_PTR   "l"
#else
#define PXL_GLOBAL_PTR   "r"
#endif

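// volatile keeps the compiler from removing or moving an asm with no outputs.

// Prefetch the cache line holding ptr (global memory) into L1.
DEVICE_STATIC_INTRINSIC_QUALIFIERS void __prefetch_global_l1(const void* const ptr)
{
  asm volatile("prefetch.global.L1 [%0];" : : PXL_GLOBAL_PTR(ptr));
}

// Prefetch the cache line holding ptr (generic address) into the uniform cache.
DEVICE_STATIC_INTRINSIC_QUALIFIERS void __prefetch_global_uniform(const void* const ptr)
{
  asm volatile("prefetchu.L1 [%0];" : : PXL_GLOBAL_PTR(ptr));
}

// Prefetch the cache line holding ptr (global memory) into L2.
DEVICE_STATIC_INTRINSIC_QUALIFIERS void __prefetch_global_l2(const void* const ptr)
{
  asm volatile("prefetch.global.L2 [%0];" : : PXL_GLOBAL_PTR(ptr));
}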
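The wrappers only issue the hint; how far ahead to prefetch is left to the caller. The following is a minimal sketch of one possible use, assuming a 64-bit build and a grid-stride loop in which each thread prefetches the element it will read on its next iteration. The names prefetch_l2, scale_with_prefetch, and run_prefetch_demo are hypothetical, and whether the hint improves performance depends on the GPU and the access pattern.

```cuda
#include <vector>
#include <cuda_runtime.h>

// L2 prefetch wrapper, repeated here so the sketch compiles on its own.
// Assumes a 64-bit build, hence the "l" constraint.
static __device__ __forceinline__ void prefetch_l2(const void* ptr)
{
  asm volatile("prefetch.global.L2 [%0];" : : "l"(ptr));
}

// Grid-stride loop: out[i] = factor * in[i], prefetching this thread's next input.
__global__ void scale_with_prefetch(const float* in, float* out, int n, float factor)
{
  int stride = blockDim.x * gridDim.x;
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
    int next = i + stride;              // element this thread reads next
    if (next < n)
      prefetch_l2(in + next);           // hint only; results do not depend on it
    out[i] = factor * in[i];
  }
}

// Hypothetical host driver: returns true if every output equals 2 * input.
bool run_prefetch_demo(int n)
{
  std::vector<float> h_in(n), h_out(n, 0.0f);
  for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

  float *d_in = nullptr, *d_out = nullptr;
  if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
  if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
    cudaFree(d_in);
    return false;
  }
  cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  scale_with_prefetch<<<32, 128>>>(d_in, d_out, n, 2.0f);
  cudaError_t launch_err = cudaGetLastError();   // catches launch failures
  cudaError_t copy_err = cudaMemcpy(h_out.data(), d_out, n * sizeof(float),
                                    cudaMemcpyDeviceToHost);  // also synchronizes
  cudaFree(d_in);
  cudaFree(d_out);
  if (launch_err != cudaSuccess || copy_err != cudaSuccess) return false;

  for (int i = 0; i < n; ++i)
    if (h_out[i] != 2.0f * h_in[i]) return false;
  return true;
}
```

Calling run_prefetch_demo(1 << 16) from main on a machine with a working device exercises the kernel. Because the prefetch only affects timing, the output is the same with the hint removed, which makes it easy to benchmark both versions side by side.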

answered Oct 01 '22 05:10

Serge Rogatch

Tags: cuda, prefetch

user1805482
