memory barrier and atomic_t on linux

Recently I have been reading some Linux kernel-space code, and I came across this:

uint64_t used;
uint64_t blocked;

used = atomic64_read(&g_variable->used);       //#1
barrier();                                     //#2
blocked = atomic64_read(&g_variable->blocked); //#3

What are the semantics of this code snippet? Does #2 make sure that #1 executes before #3? I am a little bit confused, because:

#A On a 64-bit platform, the atomic64_read macro expands to

used = (&g_variable->used)->counter           // where counter is volatile.

On a 32-bit platform, it is converted to use lock cmpxchg8b. I assume these two have the same semantics, and for the 64-bit version I think it means:

All-or-nothing: the value is read in one piece, so we can rule out the cases where the address is unaligned or the word size is larger than the CPU's native word size.
No optimization: the value is forced to be read from its memory location.

atomic64_read does not have any semantics for preserving read ordering! (see this)

#B The barrier macro is defined as

/* Optimization barrier */
/* The "volatile" is due to gcc bugs */
#define barrier() __asm__ __volatile__("": : :"memory")

According to the wiki, this just prevents the gcc compiler from reordering reads and writes.

What I am confused about is how it disables reordering by the CPU. In addition, can I treat the barrier macro as a full fence?

asked Jul 02 '11 06:07

Chang

2 Answers

32-bit x86 processors don't provide simple atomic read operations for 64-bit types. The only atomic operation on 64-bit types on such CPUs that deals with "normal" registers is LOCK CMPXCHG8B, which is why it is used here. The alternative is to use MOVQ and MMX/XMM registers, but that requires knowledge of the FPU state/registers, and requires that all operations on that value are done with the MMX/XMM instructions.
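The idea of reading a 64-bit value through a compare-and-swap can be sketched in userspace C. This is only an illustration, not the kernel's implementation: the function name read64_via_cas is hypothetical, and it relies on the GCC/Clang builtin __sync_val_compare_and_swap, which compilers typically lower to a LOCK CMPXCHG8B on 32-bit x86 when the target supports that instruction.

```c
#include <stdint.h>

/*
 * Hypothetical helper: atomically read a 64-bit value with a
 * compare-and-swap whose expected and new values are both 0.
 * If *p is 0, it is "replaced" by 0 (no visible change); otherwise
 * nothing is stored. Either way the builtin returns the value that
 * was in *p, observed as a single atomic operation.
 */
static inline uint64_t read64_via_cas(uint64_t *p)
{
    return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}
```

Note that the pointer cannot be const: a locked compare-and-swap counts as a write access to the location even when the value is left unchanged, so the memory has to be writable.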

On 64-bit x86_64 processors, aligned reads of 64-bit types are atomic, and can be done with a MOV instruction, so only a plain read is required --- the use of volatile is just to ensure that the compiler actually does a read, and doesn't cache a previous value.
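A minimal userspace sketch of such a plain volatile read might look like the following. The names my_atomic64_t and my_atomic64_read are hypothetical stand-ins, not the kernel's definitions, and the atomicity relies on the x86_64 hardware guarantee for aligned 64-bit loads rather than on anything the C standard promises for volatile.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's 64-bit atomic type. */
typedef struct {
    volatile int64_t counter;
} my_atomic64_t;

/*
 * A plain load of the counter. Because the field is volatile, the
 * compiler must emit an actual load every time this is called and
 * may not reuse a value it read earlier. On x86_64 an aligned
 * 64-bit load is a single MOV, which the hardware performs atomically.
 */
static inline int64_t my_atomic64_read(const my_atomic64_t *v)
{
    return v->counter;
}
```

On a 32-bit target the same source would generally compile to two 32-bit loads, which is why this approach is not enough there.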

As for the read ordering, the inline assembler you quote ensures that the compiler emits the instructions in the right order, and this is all that is required on x86/x86_64 CPUs, provided the writes are correctly sequenced. LOCKed writes on x86 have a total ordering; plain MOV writes provide "causal consistency", so if thread A does x=1 then y=2, and thread B reads y==2, then a subsequent read of x by thread B will see x==1.
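The x=1, y=2 scenario can be written out as a small sketch. The names compiler_barrier, writer and reader are made up for illustration, the inline asm is GCC/Clang syntax, and the ordering argument in the comments assumes an x86/x86_64 CPU as described above.

```c
/* Same idea as the kernel's barrier(): a compiler-only barrier. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static volatile int x;
static volatile int y;

/* Thread A: publish x first, then y. */
static void writer(void)
{
    x = 1;
    compiler_barrier();   /* compiler may not move memory accesses across this point */
    y = 2;
}

/*
 * Thread B: returns -1 if y has not been published yet,
 * otherwise returns the value of x read after y.
 */
static int reader(void)
{
    if (y != 2)
        return -1;
    compiler_barrier();   /* keep the read of x after the read of y */
    return x;
}
```

On x86, once reader() has observed y == 2 it is expected to return 1, because the hardware keeps the two stores and the two loads in program order and the barrier stops the compiler from changing that order. On a weakly ordered CPU the same code would not carry that guarantee.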

On IA-64, PowerPC, SPARC, and other processors with a more relaxed memory model there may well be more to atomic64_read() and barrier().

answered Oct 03 '22 06:10

Anthony Williams

x86 CPUs don’t do read-after-read reordering, so it is sufficient to prevent the compiler from doing any reordering. On other platforms such as PowerPC, things will look a lot different.
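For comparison, here is a sketch of how the two reads from the question could be ordered in portable userspace C11, where the fence also constrains the CPU on weakly ordered platforms. This is an assumption-laden illustration rather than what the kernel does: struct stats and read_stats_ordered are hypothetical names, and kernel code would use its own primitives (for example a read barrier such as smp_rmb()) instead of <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counters, mirroring the fields in the question. */
struct stats {
    _Atomic uint64_t used;
    _Atomic uint64_t blocked;
};

/*
 * Read 'used' and then 'blocked', with the second read ordered
 * after the first. The acquire fence prevents both the compiler
 * and the CPU from moving the later load ahead of the earlier one.
 */
static void read_stats_ordered(struct stats *s,
                               uint64_t *used, uint64_t *blocked)
{
    *used = atomic_load_explicit(&s->used, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    *blocked = atomic_load_explicit(&s->blocked, memory_order_relaxed);
}
```

On x86 the fence typically costs nothing at run time beyond restricting the compiler, which matches the point above; on PowerPC and similar CPUs the compiler has to emit a real barrier instruction for it.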

answered Oct 03 '22 07:10

Ringding

Tags: linux, multithreading, atomic, concurrency, barrier
