
Word Tearing on x86

Under what circumstances is it unsafe to have two different threads simultaneously writing to adjacent elements of the same array on x86? I understand that on some DS9K-like architectures with insane memory models this can cause word tearing, but on x86 single bytes are addressable. For example, in the D programming language real is an 80-bit floating point type on x86. Would it be safe to do something like:

real[] nums = new real[4];  // Assume new returns a 16-byte aligned block.
foreach(i; 0..4) {
    // Create a new thread and have it do stuff and 
    // write results to index i of nums.
}

Note: I know that, even if this is safe, it can sometimes cause false sharing problems with the cache, leading to slow performance. However, for the use cases I have in mind writes will be infrequent enough for this not to matter in practice.

Edit: Don't worry about reading back the values that are written. The assumption is that there will be synchronization before any values are read. I only care about the safety of writing in this way.

dsimcha asked Oct 22 '09 13:10



1 Answer

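The x86 has coherent caches. The last processor to write to a cache line acquires the whole line and performs the write in its own cache. This ensures that single-byte values, and 4-byte values written at correspondingly aligned addresses, are updated atomically. A store also modifies only the bytes it addresses, so writing one array element never disturbs its neighbours, even when they share a cache line.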

That's different from "it's safe". If each processor writes only to the bytes/DWORDs that it "owns" by design, then the updates will be correct. In practice, you want one processor to read values written by others, and that requires synchronization.
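As a rough sketch of that "each writer owns its slot, then synchronize before reading" pattern in D, something like the following should do. The names `spawnWriter` and `fillInParallel`, and the squaring that stands in for real work, are made up for illustration; only `core.thread.Thread` comes from the runtime.

```d
import core.thread : Thread;

// Hypothetical worker: computes a value and stores it in exactly one slot.
Thread spawnWriter(real[] nums, size_t i)
{
    // Going through a helper function gives each thread its own copy of i;
    // a delegate created directly in the loop body would share one frame.
    return new Thread({ nums[i] = cast(real) i * i; }).start();
}

real[] fillInParallel(size_t n)
{
    auto nums = new real[n];
    Thread[] workers;
    foreach (i; 0 .. n)
        workers ~= spawnWriter(nums, i);

    // join() is the synchronization point: once it returns, the element
    // written by that worker can be read safely from this thread.
    foreach (t; workers)
        t.join();
    return nums;
}

void main()
{
    import std.stdio : writeln;
    writeln(fillInParallel(4));
}
```

No thread ever touches another thread's element, and nothing is read until every writer has been joined, which is the situation the question describes.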

It is also different from "it's efficient". If several processors each write to different places in the same cache line, then the line can ping-pong between CPUs, and that's a lot more expensive than if the cache line goes to a single CPU and stays there. The usual rule is to put processor-specific data in its own cache line. Of course, if you are only going to write to just that one word, just once, and the amount of work is significant compared to a cache-line move, then your performance will be acceptable.
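If the writes were frequent enough for the ping-pong to matter, one way to follow the "own cache line" rule is to pad each slot. This is only a sketch: the 64-byte line size is an assumption (typical for current x86 parts, but check your CPU), and `PaddedSlot` and `squaresPadded` are invented names.

```d
import std.parallelism : parallel;

// Assumption: 64-byte cache lines.
enum size_t cacheLineSize = 64;

// Hypothetical per-thread slot, padded so that consecutive array
// elements sit a full cache line apart.
struct PaddedSlot
{
    real value = 0;
    ubyte[cacheLineSize - real.sizeof] pad;
}

static assert(PaddedSlot.sizeof == cacheLineSize);

real[] squaresPadded(size_t n)
{
    // Assuming the array base is at least 16-byte aligned, no value field
    // straddles a line, so no two value fields share one.
    auto slots = new PaddedSlot[n];

    // A work unit size of 1 lets the task pool hand each slot to a
    // different worker thread.
    foreach (i, ref slot; parallel(slots, 1))
        slot.value = cast(real) i * i;

    // The parallel foreach returns only after all work units are done,
    // so the values can be read here.
    real[] result;
    foreach (ref s; slots)
        result ~= s.value;
    return result;
}

void main()
{
    import std.stdio : writeln;
    writeln(squaresPadded(4));
}
```

The cost is memory: four `real` values now occupy 256 bytes instead of 64 or fewer, which is why the padding is only worth it when the slots are written often.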

Ira Baxter answered Oct 04 '22 03:10
