What is coherent memory on GPU?

Tags: graphics, gpgpu, gpu, vulkan

I have come across the terms "coherent" and "non-coherent" memory more than once in technical papers on graphics programming. I have been searching for a simple and clear explanation, but have found mostly 'hardcore' papers. I would be glad to receive a layman's-style answer on what coherent memory actually is on GPU architectures and how it compares to other (presumably non-coherent) memory types.

asked Mar 26 '16 21:03

Michael IV

2 Answers

Memory is memory. But different things can access that memory. The GPU can access memory, the CPU can access memory, maybe other hardware bits, whatever.

A particular thing has "coherent" access to memory if changes made by others to that memory are visible to the reader. Now, you might think this is foolishness. After all, if the memory has been changed, how could someone possibly be unable to see it?

Simply put, caches.

It turns out that changing memory is expensive. So we do everything possible to avoid changing memory unless we absolutely have to. When you write a single byte from the CPU to a pointer in memory, the CPU doesn't write that byte yet. Or at least, not to memory. It writes it to a local copy of that memory called a "cache."

The reason for this is that, generally speaking, applications do not write (or read) single bytes. They are more likely to write (and read) lots of bytes, in small chunks. So if you're going to perform an expensive operation like a memory load or store, you should load or store a large chunk of memory. So you store all of the changes you're going to make to a chunk of memory in a cache, then make a single write of that cached chunk to actual memory at some point in the future.
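To make the "collect changes in a cache, then write the whole chunk once" idea concrete, here is a minimal toy model in C++. It is only a sketch: `SlowMemory`, `CacheLine`, and `kLineSize` are made-up names for illustration, and a real CPU cache tracks many lines and decides on its own when to write them back.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk size; real cache lines are often 64 bytes, but this
// value is an assumption of the model.
constexpr std::size_t kLineSize = 64;

// Stands in for "actual memory". It counts how many expensive stores happen.
struct SlowMemory {
    std::vector<std::uint8_t> bytes;
    std::size_t store_transactions = 0;

    explicit SlowMemory(std::size_t size) : bytes(size, 0) {}
};

// One write-back cache line sitting in front of SlowMemory.
class CacheLine {
public:
    CacheLine(SlowMemory& memory, std::size_t base)
        : memory_(memory), base_(base) {
        assert(base + kLineSize <= memory.bytes.size());
        // Load the whole chunk once, up front.
        for (std::size_t i = 0; i < kLineSize; ++i) {
            data_[i] = memory_.bytes[base_ + i];
        }
    }

    // A write only touches the local copy; memory is not updated yet.
    void write(std::size_t offset, std::uint8_t value) {
        assert(offset < kLineSize);
        data_[offset] = value;
        dirty_ = true;
    }

    std::uint8_t read(std::size_t offset) const {
        assert(offset < kLineSize);
        return data_[offset];
    }

    // Write the whole chunk back to memory as a single store transaction.
    void flush() {
        if (!dirty_) return;
        for (std::size_t i = 0; i < kLineSize; ++i) {
            memory_.bytes[base_ + i] = data_[i];
        }
        ++memory_.store_transactions;
        dirty_ = false;
    }

private:
    SlowMemory& memory_;
    std::size_t base_;
    std::array<std::uint8_t, kLineSize> data_{};
    bool dirty_ = false;
};
```

In this model, 64 single-byte writes followed by one `flush()` cost a single store to memory. Until that flush happens, anyone looking at `SlowMemory` directly still sees the old bytes, which is exactly the visibility problem the rest of this answer is about.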

But if you have two separate devices that use the same memory, you need some way to be certain that writes one device makes are visible to other devices. Most GPUs can't read the CPU cache. And most CPU languages don't have language-level support to say "hey, that stuff I wrote to memory? I really mean for you to write it to memory now." So you usually need something to ensure visibility of changes.

In Vulkan, memory with the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property means that when you read or write that memory through a mapped pointer (the only way Vulkan lets the CPU access memory directly), you don't need to call vkInvalidateMappedMemoryRanges or vkFlushMappedMemoryRanges to make sure the CPU and GPU can see each other's changes. The visibility of any changes is guaranteed in both directions. If the memory doesn't have that flag, you must call those functions yourself to ensure coherency for the specific regions of data you want to access: vkFlushMappedMemoryRanges after CPU writes that the GPU needs to see, and vkInvalidateMappedMemoryRanges before CPU reads of data the GPU has written.
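The flush/invalidate rule can be sketched with a small toy model in C++. To be clear, this is not the Vulkan API: `MappedMemoryModel` and all of its methods are hypothetical names, `flush` and `invalidate` merely stand in for vkFlushMappedMemoryRanges and vkInvalidateMappedMemoryRanges, and the model ignores GPU-side synchronization entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model: the host sees memory through its own (possibly stale) view,
// while the device sees the real memory.
class MappedMemoryModel {
public:
    MappedMemoryModel(std::size_t size, bool host_coherent)
        : host_coherent_(host_coherent), device_(size, 0), host_view_(size, 0) {}

    // CPU write through the "mapped pointer".
    void host_write(std::size_t offset, std::uint8_t value) {
        assert(offset < host_view_.size());
        host_view_[offset] = value;
        if (host_coherent_) device_[offset] = value;  // visible right away
    }

    // CPU read through the "mapped pointer".
    std::uint8_t host_read(std::size_t offset) const {
        assert(offset < host_view_.size());
        return host_view_[offset];
    }

    // GPU write to the real memory.
    void device_write(std::size_t offset, std::uint8_t value) {
        assert(offset < device_.size());
        device_[offset] = value;
        if (host_coherent_) host_view_[offset] = value;  // visible right away
    }

    std::uint8_t device_read(std::size_t offset) const {
        assert(offset < device_.size());
        return device_[offset];
    }

    // Stand-in for a flush: push host writes in the range out to the device.
    // (Simplification: copies the whole range, not just modified bytes.)
    void flush(std::size_t offset, std::size_t size) {
        assert(offset + size <= device_.size());
        for (std::size_t i = offset; i < offset + size; ++i) {
            device_[i] = host_view_[i];
        }
    }

    // Stand-in for an invalidate: refresh the host view from the device.
    void invalidate(std::size_t offset, std::size_t size) {
        assert(offset + size <= device_.size());
        for (std::size_t i = offset; i < offset + size; ++i) {
            host_view_[i] = device_[i];
        }
    }

private:
    bool host_coherent_;
    std::vector<std::uint8_t> device_;
    std::vector<std::uint8_t> host_view_;
};
```

With `host_coherent` set to `false`, a `host_write` is invisible to `device_read` until `flush` is called, and a `device_write` is invisible to `host_read` until `invalidate` is called. With `host_coherent` set to `true`, both directions are visible immediately and the two extra calls become unnecessary.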

With coherent memory, one of two things is going on in terms of hardware. Either CPU access to the memory is not cached in any of the CPU's caches, or the GPU has direct access to the CPU's caches (perhaps due to being on the same die as the CPU(s)). You can usually tell that the latter is happening, because on-die GPU implementations of Vulkan don't bother to offer non-coherent memory options.

answered Oct 11 '22 01:10

Nicol Bolas

If memory is coherent, then all threads accessing that memory must agree on the state of the memory at all times. For example, if thread 0 reads memory location A and thread 1 reads the same location at the same time, both threads should always read the same value.

But if memory is not coherent, then threads 0 and 1 might read back different values. Thread 0 could think that location A contains a 1, while thread 1 thinks that the same location contains a 2. The different threads would have an incoherent view of the memory.

Coherence is hard to achieve with a large number of cores. Often every core must be aware of memory accesses from all other cores. In a quad-core CPU, coherence is not that hard to achieve, because each core only needs to be informed about the memory accesses of 3 other cores. In a GPU with 16 cores, however, every core must be made aware of the memory accesses of 15 other cores. The cores exchange information about the contents of their caches using so-called "cache coherence protocols".
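As a rough illustration of why the cost grows with the core count, here is a toy broadcast "write-invalidate" model in C++ for a single memory location. It is a sketch under simplifying assumptions, not a real protocol: `InvalidateProtocolModel` is a made-up name, every write notifies every other core, and writes go straight through to memory. Real protocols are considerably smarter about who gets notified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: each core caches one shared memory location.
class InvalidateProtocolModel {
public:
    explicit InvalidateProtocolModel(std::size_t cores)
        : valid_(cores, false), cached_(cores, 0) {}

    // A write updates the writer's copy and tells every other core to
    // drop its copy, which costs (cores - 1) messages per write here.
    void write(std::size_t core, int value) {
        assert(core < cached_.size());
        memory_ = value;  // write-through keeps the model short
        cached_[core] = value;
        valid_[core] = true;
        for (std::size_t other = 0; other < cached_.size(); ++other) {
            if (other == core) continue;
            valid_[other] = false;
            ++messages_;
        }
    }

    // A read uses the cached copy if it is still valid, otherwise it
    // fetches the current value from memory again.
    int read(std::size_t core) {
        assert(core < cached_.size());
        if (!valid_[core]) {
            cached_[core] = memory_;
            valid_[core] = true;
        }
        return cached_[core];
    }

    // Total invalidation messages sent so far.
    std::size_t messages() const { return messages_; }

private:
    int memory_ = 0;
    std::vector<bool> valid_;
    std::vector<int> cached_;
    std::size_t messages_ = 0;
};
```

In this model a single write on a 4-core system sends 3 invalidation messages, while the same write on a 16-core system sends 15. Because every core always refetches after an invalidation, all cores agree on the value, but the message traffic per write scales with the number of cores.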

This is why GPUs often only support limited forms of coherency. If some memory locations are read only or are only accessed by a single thread, then no coherence is required. If caches are small and coherence is not always required but only at specific instructions of the program, then it is possible to achieve correct behavior of the program using cache flushes before or after specific memory accesses.

If your hardware offers both coherent and non-coherent memory types, then you can expect that non-coherent memory will be faster, but if you try to run parallel algorithms using this memory they will fail in really weird ways.

answered Oct 11 '22 03:10

Jan Lucas

