 

Copy to shared memory in CUDA

Tags:

memory

cuda

In CUDA programming, if we want to use shared memory, we need to bring the data from global memory into shared memory. Threads are used to transfer that data.

I read somewhere (in online resources) that it is better not to involve all the threads in the block in copying data from global memory to shared memory. This idea makes sense, since not all threads execute together; only the threads within a warp execute together. But my concern is that the warps are not executed sequentially. Say a block with 96 threads is divided into 3 warps: warp 0 (threads 0-31), warp 1 (threads 32-63), warp 2 (threads 64-95). It is not guaranteed that warp 0 will be executed first (am I right?).

So which threads should I use to copy the data from global to shared memory?

asked Mar 18 '13 by user2026934


People also ask

Can CUDA use shared memory?

Shared memory is a powerful feature for writing well-optimized CUDA code. Access to shared memory is much faster than global memory access because it is located on-chip. Because shared memory is shared by the threads in a thread block, it provides a mechanism for threads to cooperate.
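For example, here is a minimal (untested) sketch of that cooperation: each thread stages one element into shared memory, and after a barrier reads back an element staged by a different thread. The kernel name reverse_block and the fixed block size of 64 threads are assumptions for illustration only.

__global__ void reverse_block(int *d)
{
    __shared__ int s[64];
    int t = threadIdx.x;      // assumes one block of exactly 64 threads
    s[t] = d[t];              // each thread stages one element
    __syncthreads();          // wait until every element is staged
    d[t] = s[63 - t];         // read an element staged by another thread
}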

Where is CUDA shared memory?

Because CUDA shared memory is located on-chip, its memory bandwidth is much larger than that of global memory, which is located off-chip. Therefore, optimizing a CUDA kernel by caching memory accesses in shared memory can significantly improve the performance of some operations, especially memory-bound ones.
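As a rough (untested) illustration of this caching, consider a 3-point moving-average stencil, a typical memory-bound operation. Without shared memory, every input element is read from global memory by three different threads; staging a block-sized tile plus a one-element halo in shared memory cuts that to roughly one global read per element. The kernel name, the block size of 256, and the assumption that n is an exact multiple of the block size are hypothetical.

#define BLOCK 256                                  // assumed block size

__global__ void stencil3(const float *in, float *out, int n)
{
    __shared__ float tile[BLOCK + 2];              // block tile plus two halo cells
    int g = blockIdx.x * blockDim.x + threadIdx.x; // global index
    int l = threadIdx.x + 1;                       // local index, past the left halo

    tile[l] = in[g];                               // one global read per element
    if (threadIdx.x == 0)                          // left halo cell
        tile[0] = (g > 0) ? in[g - 1] : 0.0f;
    if (threadIdx.x == BLOCK - 1)                  // right halo cell
        tile[BLOCK + 1] = (g + 1 < n) ? in[g + 1] : 0.0f;
    __syncthreads();

    out[g] = (tile[l - 1] + tile[l] + tile[l + 1]) / 3.0f;
}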

How do I allocate device memory in CUDA?

Memory management on a CUDA device is similar to how it is done in CPU programming. You need to allocate memory on both the host and the device, transfer the data to the device using the runtime API, retrieve the results (transfer the data back to the host), and finally free the allocated memory.
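That lifecycle looks roughly like the following sketch, which uses only standard CUDA runtime calls (error checking is omitted and the buffer size N is a placeholder):

#include <cuda_runtime.h>
#include <stdlib.h>

int main(void)
{
    const int N = 1024;                    // placeholder element count
    size_t bytes = N * sizeof(float);

    float *h_data = (float *)malloc(bytes);                     // host allocation
    float *d_data = NULL;
    cudaMalloc(&d_data, bytes);                                 // device allocation

    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);  // host -> device
    // ... launch kernels that operate on d_data ...
    cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);  // device -> host

    cudaFree(d_data);                                           // free device memory
    free(h_data);                                               // free host memory
    return 0;
}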

What is shared memory in Nvidia GPU?

Note that "shared GPU memory" in this sense (as reported by Windows, for example) is the amount of virtual memory that will be used in case dedicated video memory runs out. This typically amounts to 50% of available RAM, and combining the two pools gives the total amount. It is a different concept from CUDA's on-chip shared memory discussed above.


1 Answer

To use a single warp to load a shared memory array, just do something like this:

__global__
void kernel(float *in_data)
{
    __shared__ float buffer[1024];

    // Only the first warp (threads 0 to warpSize-1) performs the load;
    // each of its threads copies every warpSize-th element.
    if (threadIdx.x < warpSize) {
        for (int i = threadIdx.x; i < 1024; i += warpSize) {
            buffer[i] = in_data[i];
        }
    }
    // Make the loaded data visible to every thread in the block.
    __syncthreads();

    // rest of kernel follows
}

[disclaimer: written in browser, never tested, use at own risk]

The key point here is the use of __syncthreads() to ensure that all threads in the block wait until the warp performing the load to shared memory has finished the load. The code I posted uses the first warp, but you can calculate a warp number by dividing the thread index within the block by warpSize. I also assumed a one-dimensional block; it is trivial to compute the thread index in a 2D or 3D block, so I leave that as an exercise for the reader.
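For reference, and in the same untested spirit as the kernel above, the linearized thread index and warp number could be computed like this:

__device__ int linear_thread_index()
{
    // Flatten a (possibly 2D or 3D) block into a single thread index.
    return threadIdx.x
         + threadIdx.y * blockDim.x
         + threadIdx.z * blockDim.x * blockDim.y;
}

__device__ int warp_number()
{
    // Warps are formed from consecutive linear thread indices.
    return linear_thread_index() / warpSize;
}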

answered Oct 20 '22 by talonmies