memory barrier and cache flush

Are there any architectures where a memory barrier is implemented with a cache flush? I have read that a memory barrier affects only CPU reordering, but I have also read statements about memory barriers such as "ensures all the CPUs will see the value...", which to me implies a cache flush/invalidation.

asked Jul 01 '12 14:07

Mark

2 Answers

The exact effect of a memory barrier depends on the specific architecture.

CPUs employ performance optimizations that can result in out-of-order execution. The reordering of memory operations (loads and stores) normally goes unnoticed within a single thread of execution, but causes unpredictable behaviour in concurrent programs and device drivers unless carefully controlled. The exact nature of an ordering constraint is hardware dependent, and defined by the architecture's memory ordering model. Some architectures provide multiple barriers for enforcing different ordering constraints.

http://en.wikipedia.org/wiki/Memory_barrier
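In C11, the "multiple barriers" mentioned in the quote roughly correspond to the memory_order argument of atomic_thread_fence. The following is a minimal sketch, assuming POSIX threads and a compiler with <stdatomic.h>; the names x, y, r1, r2 and run_store_buffering are made up for illustration. It shows the store-buffering case, where each thread stores to one variable and then loads the other. That store-then-load order is the one reordering that even x86 permits, so it needs the strongest (sequentially consistent) fence.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;   /* shared flags, reset before each run */
static int r1, r2;        /* what each thread observed */

static void *thread_a(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    /* Full barrier: the store above is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    r1 = atomic_load_explicit(&y, memory_order_relaxed);
    return NULL;
}

static void *thread_b(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Runs both threads once and returns r1 + r2, or -1 if a thread
 * could not be created. */
int run_store_buffering(void)
{
    pthread_t a, b;

    atomic_store(&x, 0);
    atomic_store(&y, 0);
    r1 = r2 = -1;

    if (pthread_create(&a, NULL, thread_a, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, thread_b, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    /* With both fences in place, at least one thread must see the
     * other's store; without them, r1 == 0 && r2 == 0 is allowed. */
    assert(r1 == 1 || r2 == 1);
    return r1 + r2;
}
```

Build with something like cc -std=c11 -pthread and call run_store_buffering() from main. Weaker fences (memory_order_acquire, memory_order_release) do not forbid the "both zero" outcome here, which is why the choice of barrier matters.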

Current Intel architectures keep caches coherent across all CPUs automatically, without explicit use of memory barrier or cache flush instructions.

In symmetric multiprocessor (SMP) systems, each processor has a local cache. The memory system must guarantee cache coherence. False sharing occurs when threads on different processors modify variables that reside on the same cache line. This invalidates the cache line and forces an update, which hurts performance.

http://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads/
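As a rough illustration of the false sharing described in the quote, here is a minimal C11 sketch of the usual mitigation: giving each per-thread variable its own cache line. The 64-byte line size is an assumption (it is common on x86 but not universal), and the struct and function names are hypothetical, not taken from the Intel article.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumption: 64-byte cache lines */

/* Adjacent members: a and b almost certainly share one cache line,
 * so two threads updating them separately would keep invalidating
 * each other's copy of that line. */
struct packed_counters {
    long a;
    long b;
};

/* Each member starts on its own cache-line boundary, so updates to
 * a and b touch different lines. */
struct padded_counters {
    alignas(CACHE_LINE) long a;
    alignas(CACHE_LINE) long b;
};

/* Compile-time check that the padding really separates the members. */
static_assert(offsetof(struct padded_counters, b) -
              offsetof(struct padded_counters, a) >= CACHE_LINE,
              "padded members must not share a cache line");

/* Distance in bytes between a and b in each layout. */
size_t packed_gap(void)
{
    return offsetof(struct packed_counters, b) -
           offsetof(struct packed_counters, a);
}

size_t padded_gap(void)
{
    return offsetof(struct padded_counters, b) -
           offsetof(struct padded_counters, a);
}
```

The trade-off is memory: the padded struct occupies two full cache lines instead of a few bytes, so this is worth doing only for data that different threads write frequently.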

answered Sep 22 '22 05:09

Eric J.

On pretty much all modern architectures, caches (like the L1 and L2 caches) are kept coherent by hardware. There is no need to flush any cache to make memory visible to other CPUs.

One could imagine hypothetically a system that was not cache coherent in hardware, but it wouldn't look anything like the current systems that run operating systems like Windows and Linux.

Memory barriers are needed on these architectures to do three things:

1. The CPU may pre-fetch a read that's invalidated by a write on another core. This must be prevented. (Though on x86, this is prevented in hardware. The pre-fetch is locked to the L1 cache line, so if another CPU invalidates the cache line, the pre-fetch is invalidated as well.)
2. The CPU may "post" writes and not put them in its L1 cache yet. These writes must be completed at least to L1 cache.
3. The CPU may re-order reads and writes on one side of the memory barrier with reads and writes on the other side. Depending on the type of memory barrier, some of these re-orderings must be prohibited. (For example, read x; read y; doesn't ensure the reads happen in that order. But read x; memory_barrier(); read y; typically does.)
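The points above can be sketched in portable C11, assuming POSIX threads and <stdatomic.h>. This is only an illustration: memory_barrier() in the text is generic, and here atomic_thread_fence stands in for it; the names payload, ready, writer, reader and run_message_passing are hypothetical. The writer's fence keeps its data write ahead of the flag write, and the reader's fence is the "read x; memory_barrier(); read y;" pattern. Nothing in the code flushes a cache.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary data published by the writer */
static atomic_int ready;   /* flag that announces the data */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;                                  /* write the data */
    /* Release fence: the write above may not be reordered after
     * the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static int reader(void)
{
    /* read x: spin until the flag is observed */
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;
    /* memory_barrier(): acquire fence, so the read below may not be
     * satisfied before the flag read above. */
    atomic_thread_fence(memory_order_acquire);
    /* read y: guaranteed to see the writer's value */
    int value = payload;
    assert(value == 42);
    return value;
}

/* Starts the writer thread, reads in the calling thread, and returns
 * the value seen, or -1 if the thread could not be created. */
int run_message_passing(void)
{
    pthread_t t;

    payload = 0;
    atomic_store(&ready, 0);
    if (pthread_create(&t, NULL, writer, NULL) != 0)
        return -1;

    int seen = reader();
    pthread_join(t, NULL);
    return seen;
}
```

On x86 these two fences typically compile to no instruction at all and only restrain the compiler, because the hardware already preserves these orderings. On more weakly ordered architectures they become real barrier instructions.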

answered Sep 24 '22 05:09

David Schwartz

Tags: c, caching, memory-barriers
