 

How to get memory bandwidth from memory clock/memory speed

FYI, here are the specs I got from Nvidia:

http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-680/specifications

http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-titan/specifications

Note that memory speed and memory clock are the same thing on their website, and both are measured in Gbps.

Thanks!

asked Feb 24 '13 by Blue_Black


People also ask

How is memory bandwidth calculated?

For example: for DDR4-2933, the memory supported by some Core X-series processors, the calculation is (1466.67 MHz × 2) × 8 (bytes of bus width) × 4 (channels) = 93,866.88 MB/s, or about 94 GB/s. A lower-than-expected memory bandwidth may be seen due to many system variables, such as software workloads and system power states.
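
The same arithmetic written out as a tiny program (a minimal sketch; the DDR4-2933 figures are the ones quoted above, so substitute your own platform's clock, channel width, and channel count):

```cpp
// Sketch of the DDR4-2933 bandwidth arithmetic quoted above.
// All figures (1466.67 MHz bus clock, 8-byte channels, 4 channels) are
// taken from that example, not measured.
#include <cstdio>

int main() {
    double base_clock_mhz = 1466.67;  // DDR4-2933 bus clock
    double pump_rate      = 2.0;      // DDR = two transfers per clock
    double bytes_per_chan = 8.0;      // 64-bit channel = 8 bytes per transfer
    double channels       = 4.0;      // quad-channel Core X-series example

    double mb_per_s = base_clock_mhz * pump_rate * bytes_per_chan * channels;
    std::printf("Theoretical bandwidth: %.2f MB/s (~%.0f GB/s)\n",
                mb_per_s, mb_per_s / 1000.0);
    return 0;
}
```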

What is memory bandwidth speed?

Memory bandwidth is the rate at which data can be read from or stored into a semiconductor memory by a processor. Memory bandwidth is usually expressed in units of bytes/second, though this can vary for systems with natural data sizes that are not a multiple of the commonly used 8-bit bytes.

How do I check my CPU memory bandwidth?

Open up Task Manager by right-clicking on the Windows taskbar and selecting Task Manager. Navigate to the Performance tab — it will open with the CPU view selected, so you'll want to choose the Memory view from the left navigation panel. After clicking on Memory, you can view your RAM speed and other details.

How is memory bandwidth calculated GPU?

We're dividing by 8 to convert the bus width to bytes (for easier reading by humans). We then multiply by the memory clock (also a given -- use GPU-Z to see this), then multiply the product by 2 (for DDR) and then by 2 again (for GDDR5). This gives us our memory bandwidth rating.
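
If you have a CUDA-capable card, you can read the same two inputs (memory clock and bus width) from the driver instead of GPU-Z. This is a minimal sketch using `cudaGetDeviceProperties`; `memoryClockRate` is reported in kHz and `memoryBusWidth` in bits (recent toolkits mark these fields as deprecated, but they still illustrate the arithmetic):

```cpp
// Minimal sketch: theoretical bandwidth from CUDA device properties.
// Compile with nvcc. The x2 factor is the DDR doubling of the reported clock.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::fprintf(stderr, "No CUDA device found\n");
        return 1;
    }
    // clock (kHz) * 2 transfers per clock * bus width in bytes = kB/s,
    // then divide by 1e6 to get GB/s
    double gb_per_s = 2.0 * prop.memoryClockRate *
                      (prop.memoryBusWidth / 8.0) / 1.0e6;
    std::printf("%s: %d-bit bus @ %d kHz -> ~%.1f GB/s theoretical\n",
                prop.name, prop.memoryBusWidth, prop.memoryClockRate, gb_per_s);
    return 0;
}
```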


1 Answer

The Titan has a 384-bit bus while the GTX 680 only has a 256-bit one, hence 50% more memory bandwidth (assuming clocks and latencies are identical).

Edit: I'll try to explain the whole concept a bit more: the following is a simplified model of the factors that determine the performance of RAM (not only on graphics cards).

Factor A: Frequency

RAM runs at a clock speed. RAM running at 1 GHz "ticks" 1,000,000,000 (a billion) times a second. With every tick, it can receive or send one bit on every lane. So a theoretical RAM module with only one memory lane running at 1 GHz would deliver 1 gigabit per second; since there are 8 bits to the byte, that means 125 megabytes per second.

Factor B: "Pump Rate"

DDR-RAM (Double Data Rate) can deliver two bits per tick, and there are even "quad-pumped" buses that deliver four bits per tick, but I haven't heard of the latter being used on graphics cards.

Factor C: Bus Width

RAM doesn't just have one single lane to send data. Even the Intel 4004 had a 4-bit bus. The graphics cards you linked have 256 and 384 bus lanes respectively.

All of the above factors are multiplied to calculate the theoretical maximum at which data can be sent or received:

**Maximum throughput in bytes per second = Frequency * Pump rate * Bus width / 8**

Now let's do the math for the two graphics cards you linked. They both seem to use the same type of RAM (GDDR5 with a pump rate of 2), both running at 3 GHz.

GTX 680: 3 GHz * 2 * 256 bit / 8 = 192 GB/s

GTX Titan: 3 GHz * 2 * 384 bit / 8 = 288 GB/s
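
Expressed as code, the formula and the two worked examples look like this (a small sketch using the 3 GHz clock, pump rate of 2, and bus widths quoted above):

```cpp
#include <cstdio>

// Theoretical maximum throughput in GB/s:
// frequency (GHz) * pump rate * bus width (bits) / 8
double peak_bandwidth_gbs(double freq_ghz, double pump_rate, int bus_bits) {
    return freq_ghz * pump_rate * bus_bits / 8.0;
}

int main() {
    std::printf("GTX 680:   %.0f GB/s\n", peak_bandwidth_gbs(3.0, 2.0, 256)); // 192
    std::printf("GTX Titan: %.0f GB/s\n", peak_bandwidth_gbs(3.0, 2.0, 384)); // 288
    return 0;
}
```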

Factor D: Latency - or reality kicks in

This factor is a LOT harder to calculate than all of the above combined. Basically, when you tell your RAM "hey, I want this data", it takes a while until it comes up with the answer. This latency depends on a number of things and is really hard to calculate, and it usually results in RAM systems delivering far less than their theoretical maxima. This is where all the timings, prefetching and tons of other stuff come into the picture. Since these aren't numbers that lend themselves to marketing (where higher numbers translate to "better"), the marketing focus is mostly on other things. And in case you wondered, this is mostly where GDDR5 differs from the DDR3 you've got on your mainboard.
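
To see how far reality falls short of the theoretical figure, you can time a large device-to-device copy on the card. Below is a rough CUDA sketch of that idea (the same approach as NVIDIA's bandwidthTest sample, not a replacement for it); the achieved number depends on buffer size, clocks and driver state:

```cpp
// Rough sketch: measure achieved (not theoretical) memory bandwidth by
// timing a large device-to-device copy with CUDA events. Compile with nvcc.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256ull << 20;           // 256 MiB per buffer
    void *src = nullptr, *dst = nullptr;
    cudaMalloc(&src, bytes);
    cudaMalloc(&dst, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);  // warm-up

    cudaEventRecord(start);
    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // Each byte is read once and written once, hence the factor of 2.
    double gb_per_s = 2.0 * bytes / (ms / 1000.0) / 1.0e9;
    std::printf("Achieved device-to-device bandwidth: ~%.1f GB/s\n", gb_per_s);

    cudaFree(src); cudaFree(dst);
    cudaEventDestroy(start); cudaEventDestroy(stop);
    return 0;
}
```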

answered Jan 02 '23 by Hazzit