Shared memory is "striped" into banks. This leads to the whole issue of bank conflicts, as we all know.
Question: But how can you determine how many banks ("stripes") exist in shared memory?
(Poking around NVIDIA "devtalk" forums, it seems that per-block shared memory is "striped" into 16 banks. But how do we know this? The threads suggesting this are a few years old. Have things changed? Is it fixed on all NVIDIA CUDA-capable cards? Is there a way to determine this from the runtime API (I don't see it there, e.g. under cudaDeviceProp)? Is there a manual way to determine it at runtime?)
The warp size is 32 threads and the number of banks is also 32, so bank conflicts can occur between any threads in the warp.
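To illustrate (this sketch is not from the original thread, and the kernel and variable names are made up): assuming 32 four-byte banks, a stride-1 access within a warp is conflict-free, while a stride-2 access makes pairs of threads in the same warp hit the same bank, i.e. a 2-way bank conflict.

```
__global__ void strided_access(float *out)
{
    __shared__ float tile[64];

    int tid = threadIdx.x;

    tile[tid] = (float)tid;      // stride-1: each thread uses a different bank, no conflict
    __syncthreads();

    // stride-2: thread tid and thread tid+16 both map to bank (2*tid) % 32,
    // so every bank is accessed by two threads of the warp (2-way conflict)
    out[tid] = tile[tid * 2];
}

// launch with a single warp, e.g.: strided_access<<<1, 32>>>(d_out);
```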
Declare shared memory in CUDA C/C++ device code using the __shared__ variable declaration specifier. There are multiple ways to declare shared memory inside a kernel, depending on whether the amount of memory is known at compile time or at run time.
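As a minimal sketch of the two styles (illustrative names, no error checking): a statically sized array declared inside the kernel, and a dynamically sized one declared with `extern __shared__` whose size is passed as the third launch-configuration argument.

```
__global__ void static_shared(float *d_out)
{
    __shared__ float s[64];          // size known at compile time
    int t = threadIdx.x;
    s[t] = (float)t;
    __syncthreads();
    d_out[t] = s[64 - t - 1];        // read back in reverse order
}

__global__ void dynamic_shared(float *d_out, int n)
{
    extern __shared__ float s[];     // size supplied at kernel launch
    int t = threadIdx.x;
    if (t < n) s[t] = (float)t;
    __syncthreads();
    if (t < n) d_out[t] = s[n - t - 1];
}

// launches (assumed 64-element buffers):
// static_shared<<<1, 64>>>(d_out);
// dynamic_shared<<<1, 64, 64 * sizeof(float)>>>(d_out, 64);
```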
Memory management on a CUDA device is similar to how it is done in CPU programming. You need to allocate memory space on the device, transfer the data to the device using the built-in API, retrieve the results (transfer the data back to the host), and finally free the allocated memory.
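For context, here is a minimal sketch of that allocate / copy / copy-back / free cycle using the runtime API (error checking omitted, buffer size chosen arbitrarily):

```
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    const int n = 256;
    size_t bytes = n * sizeof(float);

    float h_data[n];                     // host buffer
    for (int i = 0; i < n; ++i) h_data[i] = (float)i;

    float *d_data = NULL;
    cudaMalloc(&d_data, bytes);          // allocate on the device
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);   // host -> device

    // ... launch kernels that operate on d_data ...

    cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);   // device -> host
    cudaFree(d_data);                    // free device memory
    return 0;
}
```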
As @RobertHarvey says, it's documented. The programming guide indicates 16 banks for compute capability 1.x, and 32 banks for compute capability 2.x and 3.x. You can thus make any decisions based on the compute capability (major version) returned in device properties.
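A minimal sketch of that approach (assuming device 0, error checking omitted): the bank count is not reported directly in `cudaDeviceProp`, so derive it from the major compute capability as documented in the programming guide.

```
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);   // query properties of device 0

    // 16 banks for compute capability 1.x, 32 banks for 2.x and later
    int banks = (prop.major >= 2) ? 32 : 16;

    printf("Compute capability %d.%d -> %d shared memory banks\n",
           prop.major, prop.minor, banks);
    return 0;
}
```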
The general link to the CUDA online documentation is contained in the info link for the cuda tag.