
Do bank conflicts occur on non-GPU hardware?

This blog post explains how memory bank conflicts kill the transpose function's performance.

Now I can't help but wonder: does the same happen on a "normal" CPU (in a multithreaded context)? Or is this specific to CUDA/OpenCL? Or does it not even appear on modern CPUs because of their relatively large cache sizes?

rubenvb asked Jun 19 '14 14:06



1 Answer

There have been bank conflicts since the earliest vector-processing CPUs of the 1960s. They are caused by interleaved memory or multi-channel memory access (MCMA).

Interleaved memory access, or MCMA, addresses the problem of slow RAM access by phasing accesses to successive words of memory across different banks or channels. But there is a side effect: repeated accesses to the same bank take longer than accesses to adjacent banks.
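To make the side effect concrete, here is a minimal sketch of low-order interleaving, assuming a hypothetical machine with 16 banks where word `i` lives in bank `i % 16`. The names `NUM_BANKS`, `bank_of`, and `banks_touched` are made up for illustration and do not model any particular CPU or GPU.

```python
NUM_BANKS = 16  # hypothetical bank count, chosen only for illustration


def bank_of(word_index, num_banks=NUM_BANKS):
    """Low-order interleaving: consecutive words land in consecutive banks."""
    return word_index % num_banks


def banks_touched(stride, accesses=NUM_BANKS, num_banks=NUM_BANKS):
    """Count the distinct banks hit by `accesses` word accesses at a fixed stride."""
    return len({bank_of(i * stride, num_banks) for i in range(accesses)})


if __name__ == "__main__":
    for stride in (1, 2, 8, 16, 17):
        print(f"stride {stride:2d}: {banks_touched(stride)} distinct banks")
```

In this toy model a stride of 1 or 17 spreads sixteen accesses over all sixteen banks, while a stride of 16 sends every access to the same bank, which is the conflict case.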

From Wikipedia on the 1980s Cray-2 (http://en.wikipedia.org/wiki/Cray-2):

"Main memory banks were arranged in quadrants to be accessed at the same time, allowing programmers to scatter their data across memory to gain higher parallelism. The downside to this approach is that the cost of setting up the scatter/gather unit in the foreground processor was fairly high. Stride conflicts corresponding to the number of memory banks suffered a performance penalty (latency) as occasionally happened in power-of-2 FFT-based algorithms. As the Cray 2 had a much larger memory than Cray 1's or X-MPs, this problem was easily rectified by adding an extra unused element to an array to spread the work out"
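The padding trick mentioned in the quote can be sketched the same way. This assumes a hypothetical 16-bank memory with word `i` in bank `i % 16` and a row-major 16 x 16 array; `column_banks` is an illustrative helper, not anything from the Cray-2 toolchain.

```python
NUM_BANKS = 16  # hypothetical bank count
N = 16          # logical matrix dimension, a power of two on purpose


def column_banks(rows, row_length, num_banks=NUM_BANKS):
    """Banks touched when walking down column 0 of a row-major 2-D array.

    `row_length` is the allocated row length in words, including any padding.
    """
    return {(r * row_length) % num_banks for r in range(rows)}


unpadded = column_banks(N, N)      # rows are exactly N words long
padded = column_banks(N, N + 1)    # one extra, unused word per row

if __name__ == "__main__":
    print("unpadded column walk hits", len(unpadded), "bank(s)")
    print("padded column walk hits", len(padded), "bank(s)")
```

With rows of exactly 16 words, every element of a column maps to the same bank; one unused word per row shifts each row by one bank, so the column walk cycles through all of them. The cost is a little wasted memory per row.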

Tim Child answered Oct 05 '22 02:10
