Why is this a conflict-free memory bank access?

Tags: memory, cuda, gpgpu

Here is an image taken from the CUDA C Programming Guide:

[Image: diagram of threads accessing shared memory banks]

The guide says that this is an example of a Conflict-free access since threads 3, 4, 6, 7 and 9 access the same word within bank 5.

I don't quite understand why this is conflict-free: threads 3, 4, 6, 7 and 9 all access the same word within the same bank (shouldn't that be a bank conflict?), and thread 5 has to access bank 4 as well.

Could you please explain this case to me?


asked Mar 19 '14 17:03

syntagma

1 Answer

Note that a bank is not the same thing as a word or location in shared memory. A bank refers collectively to all words in shared memory that satisfy a certain address pattern condition.
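To make the "address pattern" idea concrete, here is a minimal host-side sketch of how a byte offset could map to a bank. It assumes 32 banks of 4-byte words with successive words assigned to successive banks, which is the layout on newer devices (compute capability 1.x uses 16 banks). The names `kWordBytes`, `kNumBanks`, `word_index` and `bank_of` are made up for illustration and are not part of any CUDA API.

```cpp
#include <cstdint>

// Assumed layout (not stated in the answer): 4-byte words, 32 banks,
// successive 32-bit words assigned to successive banks.
constexpr std::uint32_t kWordBytes = 4;
constexpr std::uint32_t kNumBanks = 32;

// Index of the 32-bit word that contains a given shared-memory byte offset.
constexpr std::uint32_t word_index(std::uint32_t byte_offset) {
    return byte_offset / kWordBytes;
}

// Bank that holds that word: every kNumBanks-th word falls in the same bank.
constexpr std::uint32_t bank_of(std::uint32_t byte_offset) {
    return word_index(byte_offset) % kNumBanks;
}
```

Under these assumptions, byte offsets 20 and 148 are different words (5 and 37) but both belong to bank 5, which is why "same bank" and "same word" have to be kept apart.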

In general, shared memory bank conflicts can be avoided if all accesses from a warp (or a half-warp on compute capability 1.x devices) go to separate banks. These accesses need not be in warp order, i.e. they can be scrambled, as long as the request from each thread targets a separate bank.
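The "separate banks, in any order" rule can be sketched as a small host-side check. This is only a model, assuming 32 banks and word-granular accesses; `all_banks_distinct` is a hypothetical helper, and it deliberately covers just the rule stated above (it treats any two requests to one bank as a potential conflict, even if they hit the same word).

```cpp
#include <bitset>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kNumBanks = 32;  // assumption: 32 banks

// Returns true if every requested word lies in a different bank.
// The position of a thread in the vector plays no role, so a scrambled
// (permuted) access pattern passes just like an in-order one.
bool all_banks_distinct(const std::vector<std::uint32_t>& word_per_thread) {
    std::bitset<kNumBanks> used;
    for (std::uint32_t word : word_per_thread) {
        const std::uint32_t bank = word % kNumBanks;
        if (used.test(bank)) {
            return false;  // a second request reached an already-used bank
        }
        used.set(bank);
    }
    return true;
}
```

For example, the word indices `{3, 0, 2, 1}` pass the check even though they are out of thread order, while `{0, 32}` fails because both words fall in bank 0.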

The above description covers every arrow in your diagram except those arrows pointing to bank 5.

If we had no other information, then multiple arrows targeting a single bank would indicate a potential bank conflict.

However, there is an exception: when the accesses target not only the same bank but also the same word in memory. When multiple shared memory requests target the same word, the shared memory system uses a broadcast mechanism to take the data contained in that word and deliver it to all the requesting threads in a single cycle.

From the documentation (http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#shared-memory-1-x):

Shared memory features a broadcast mechanism whereby a 32-bit word can be read and broadcast to several threads simultaneously when servicing one memory read request. This reduces the number of bank conflicts when several threads read from an address within the same 32-bit word.
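Putting the broadcast exception together with the bank rule, one way to reason about a read pattern is to count distinct words per bank rather than requests per bank. The sketch below is a simplified host-side model, not how the hardware is implemented; it assumes 32 banks and read-only accesses, and `conflict_degree` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

constexpr std::uint32_t kNumBanks = 32;  // assumption: 32 banks

// Largest number of DISTINCT words requested from any single bank.
// Requests for the same word are counted once, which models the broadcast.
// A result of 1 means conflict-free; n > 1 models an n-way bank conflict.
// An empty request list yields 0.
std::size_t conflict_degree(const std::vector<std::uint32_t>& word_per_thread) {
    std::map<std::uint32_t, std::set<std::uint32_t>> words_in_bank;
    for (std::uint32_t word : word_per_thread) {
        words_in_bank[word % kNumBanks].insert(word);
    }
    std::size_t degree = 0;
    for (const auto& entry : words_in_bank) {
        degree = std::max(degree, entry.second.size());
    }
    return degree;
}
```

With a pattern loosely modeled on the question, `{0, 1, 2, 5, 5, 4, 5, 5, 8, 5}` (threads 3, 4, 6, 7 and 9 all read word 5), the degree is 1, so the access is conflict-free. By contrast, `{5, 37}` has degree 2, because words 5 and 37 are different words in the same bank.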


answered Sep 20 '22 03:09

Robert Crovella
