Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are two consequent CPU stores on x86 flushed to the cache keeping the order?

Assume there are two threads running on x86 CPU0 and CPU1 respectively. Thread running on CPU0 executes the following commands:

A=1
B=1

Cache line containing A initially owned by CPU1 and that containing B owned by CPU0.

I have two questions:

  1. If I understand correctly, both stores will be put into CPU’s store buffer. However, for the first store A=1 the cache of CPU1 must be invalidated while the second store B=1 can be flushed immediately since CPU0 owns the cache line containing it. I know that x86 CPU respects store orders. Does that mean that B=1 will not be written to the cache before A=1?

  2. Assume in CPU1 the following commands are executed:

while (B=0);
print A

Is it enough to add only lfence between the while and print commands in CPU1 without adding a sfence between A=1 and B=1 in CPU0 to get 1 always printed out on x86?

while (B=0);
lfence
print A
like image 916
digger Avatar asked Jan 15 '12 22:01

digger


1 Answers

In x86, writes by a single processor are observed in the same order by all processors. No need to fence in your example, nor in any normal program on x86. Your program:

while(B==0);  // wait for B == 1 to become globally observable
print A;      // now, A will always be 1 here

What exactly happens in cache is model specific. All kinds of tricks and speculative behavior can occur in cache, but the observable behavior always follows the rules.

See Intel System Programming Guide Volume 3 section 8.2.2. for the details on memory ordering.

like image 183
srking Avatar answered Sep 19 '22 13:09

srking