Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to guarantee that an update to "reference type" item in Array is visible to other threads?

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    instrumentInfos[instrument.Id] = info;  // need to make it visible to other threads!
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    return instrumentInfos[instrument.Id];  // need to obtain fresh value!
}

SetInstrumentInfo and GetInstrumentInfo are called from different threads. InstrumentInfo is immutable class. Am I guaranteed to have the most recent copy when calling GetInstrumentInfo? I'm afraid that I can receive "cached" copy. Should I add kind of synchronization?

Declaring instrumentInfos as volatile wouldn't help because I need declare array items as volatile, not array itself.

Do my code has problem and if so how to fix it?

UPD1:

I need my code to work in real life not to correspond to all specifications! So if my code works in real life but will not work "theoretically" on some computer under some environment - that's ok!

  • I need my code to work on modern X64 server (currently 2 processors HP DL360p Gen8) under Windows using latest .NET Framework.
  • I don't need to work my code under strange computers or Mono or anything else
  • I don't want to introduce latency as this is HFT software. So as " Microsoft's implementation uses a strong memory model for writes. That means writes are treated as if they were volatile" I likely don't need to add extra Thread.MemoryBarrier which will do nothing but add latency. I think we can rely that Microsoft will keep using "strong memory model" in future releases. At least it's very unlikely that Microsoft will change memory model. So let's assume it will not.

UPD2:

The most recent suggestion was to use Thread.MemoryBarrier();. Now I don't understand exact places where I must insert it to make my program works on standard configuration (x64, Windows, Microsoft .NET 4.0). Remember I don't want to insert lines "just to make it possible to launch your program on IA64 or .NET 10.0". Speed is more important for me than portability. However it would be also interesting how to update my code so it will work on any computer.

UPD3

.NET 4.5 solution:

    public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
    {
        if (instrument == null || info == null)
        {
            return;
        }
        Volatile.Write(ref instrumentInfos[instrument.Id], info);
    }

    public InstrumentInfo GetInstrumentInfo(Instrument instrument)
    {
        InstrumentInfo result = Volatile.Read(ref instrumentInfos[instrument.Id]);
        return result;
    }
like image 726
Oleg Vazhnev Avatar asked Aug 14 '12 10:08

Oleg Vazhnev


3 Answers

This is a question with a long and complicated answer, but I'll try to distill it into some actionable advice.

1. Simple solution: only access instrumentInfos under a lock

The easiest way to avoid the unpredictability in multi-threaded programs is to always protect shared state using locks.

Based on your comments, it sounds like you consider this solution to be too expensive. You may want to double-check that assumption, but if that's really the case, then let's look at the remaining options.

2. Advanced solution: Thread.MemoryBarrier

Alternatively, you can use Thread.MemoryBarrier:

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM]; 

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info) 
{ 
    if (instrument == null || info == null) 
    { 
        return; 
    } 

    Thread.MemoryBarrier(); // Prevents an earlier write from getting reordered with the write below

    instrumentInfos[instrument.Id] = info;  // need to make it visible to other threads! 
} 

public InstrumentInfo GetInstrumentInfo(Instrument instrument) 
{ 
    InstrumentInfo info = instrumentInfos[instrument.Id];  // need to obtain fresh value! 
    Thread.MemoryBarrier(); // Prevents a later read from getting reordered with the read above
    return info;
}

Using the Thread.MemoryBarrier before the write and after the read prevents the potential trouble. The first memory barrier prevents the writing thread from reordering a write that initializes the object's field with the write that publishes the object into the array, and the second memory barrier prevents the reading thread from reordering the read that receives the object from the array with any subsequent reads of the fields of that object.

As a side note, .NET 4 also exposes Thread.VolatileRead and Thread.VolatileWrite that use Thread.MemoryBarrier as shown above. However, there is no overload of Thread.VolatileRead and Thread.VolatileWrite for reference types other than System.Object.

3. Advanced solution (.NET 4.5): Volatile.Read and Volatile.Write

.NET 4.5 exposes Volatile.Read and Volatile.Write methods that are more efficient than full memory barriers. If you are targeting .NET 4, this option won't help.

4. "Wrong but will happen to work" solution

You should never ever rely on what I'm about to say. But... it is very unlikely that you'd be able to reproduce the issue that is present in your original code.

In fact, on X64 in .NET 4, I would be extremely surprised if you could ever reproduce it. X86-X64 provides fairly strong memory reordering guarantees, and so these kinds of publication patterns happen to work correctly. The .NET 4 C# compiler and .NET 4 CLR JIT compiler also avoid optimizations that would break your pattern. So, none of the three components that are allowed to reorder the memory operations will happen to do so.

That said, there are (somewhat obscure) variants of the publication pattern that actually don't work on .NET 4 in X64. So, even if you think that the code will never need to run on any architecture other than .NET 4 X64, your code will be more maintainable if you use one of the correct approaches, even though the issue is not presently reproducible on your server.

like image 138
Igor ostrovsky Avatar answered Oct 23 '22 20:10

Igor ostrovsky


You can use a system-level synchronization primitive (like Mutex); but those are bit heavy-handed. Mutex is really slow because it's a kernal-level primitive. This means you can have mutually exclusive code across processes. This isn't want you need and you could use something much less costly in performance like lock or Monitor.Enter or Monitor.Exit that works within a single process.

You can't use something like VolatileRead/Write or MemoryBarrier because you need to make the act of writing to the collection atomic if elements in the array are not assigned atomically. VolatileRead/Write or MemoryBarrier doesn't do that. These just give you acquire and release semantics. It means that whatever is written to is "visible" to other threads. If you don't make the write to the collection atomic, you can corrupt data--acquire/release semantics won't help with that.

e.g.:

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

private readonly object locker = new object();

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    lock(locker)
    {
        instrumentInfos[instrument.Id] = info;
    }
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    lock(locker)
    {
        return instrumentInfos[instrument.Id];
    }
}

A concurrent collection would do that same; but, you incur a cost because every operation is guarded by synchronization. Plus, a concurrent collection only knows about atomicity of accesses to elements, you'd still have to ensure atomicity at your applications level: If you did the following with a concurrent collection:

public bool Contains(Instrument instrument)
{
   foreach(var element in instrumentInfos)
   {
       if(element == instrument) return true;
   }
}

...you'd have at least a couple problems. One, you're not stopping SetInstrumentInfo from modifying the collection while it's being enumerated (many collections do not support this and throw and exception). i.e. the collection is only "guarded" in the retrieval of a single element. Two, the collection is guarded at each iteration. If you have 100 elements and the concurrent collection uses a lock, you get 100 locks (assuming the element to find is last or not found at all). This would be much slower than it needs to be. If you don't use a concurrent collection you could simply use lock to get the same result:

public bool Contains(Instrument instrument)
{
   lock(locker)
   {
      foreach(var element in instrumentInfos)
      {
          if(element == instrument) return true;
      }
   }
}

and have a single lock and be much more performant.

UPDATE I think it's important to point out that if InstrumentInfo is a struct, use of lock (or some other synchronization primitive) becomes even more important because it would have value semantics and every assignment would require moving as many bytes as InstrumentInfo uses to store its fields (i.e. it's no longer a 32-bit or 64-bit native word reference assignment). i.e. a simple InstrumentInfo assignment (e.g. to an array element) would never be atomic and thus not thread-safe.

UPDATE 2 If the operation to replace an array element is potentially atomic (becomes simply a reference assignment if InstrumentInfo is a reference and the IL instruction stelem.ref is atomic). (i.e. the act of writing an element to an array is atomic w.r.t. to what I mention above) In which case if the only code that deals with instrumentInfos is what you posted then you can use Thread.MemoryBarrier():

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    Thread.MemoryBarrier();
    instrumentInfos[instrument.Id] = info;
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    var result = instrumentInfos[instrument.Id];
    Thread.MemoryBarrier();
    return result;
}

... which would be the equivalent of declaring each element in the array volatile if that were possible.

VolatileWrite won't work because it requires a reference to the variable it will write to, you can't give it the reference to an array element.

like image 27
Peter Ritchie Avatar answered Oct 23 '22 22:10

Peter Ritchie


My preferred way to resolve this is with a critical section. C# has had a built-in language construct called "Lock" that solves this problem. The lock statement makes sure no more than one thread can be inside that critical section at the same time.

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null) {
        return;
    }
    lock (instrumentInfos) {
        instrumentInfos[instrument.Id] = info;
    }
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    lock (instrumentInfos) {
        return instrumentInfos[instrument.Id];
    }
}

This does have performance implications, but it always ensures that you will get a reliable result. This is especially useful if you ever iterate over the instrumentInfos object; you'll absolutely need a lock statement for code like that.

Note that lock is a general purpose solution that ensures any complex statements are executed reliably and atomically. In your case, because you're setting an object pointer, you may find that it's possible to use a simpler construct like a thread-safe list or array - provided these are the only two functions that ever touch instrumentInfos. More details are available here: Are C# arrays thread safe?

EDIT: The key question about your code is: what is InstrumentInfo? If it is derived from object, you will need to be careful to always construct a new InstrumentInfo object each time you are updating the shared array since updating an object on a separate thread can cause issues. If it is a struct, the lock statement will provide all the security you need since the values within it will be copied on write and read.

EDIT 2: From a bit more research, it appears that there does exist some cases where changes to a thread-shared variable won't appear in certain compiler configurations. This thread discusses a case where a "bool" value can be set on one thread and never retrieved on the other: Can a C# thread really cache a value and ignore changes to that value on other threads?

I notice, however, that this case only exists where a .NET value type is involved. If you look at Joe Ericson's response. the decompilation of this particular code shows why: the thread-shared value is read into a register, then never re-read since the compiler optimizes away the unnecessary read in the loop.

Given that you're using an array of shared objects, and given that you've encapsulated all accessors, I think you are perfectly safe using your raw methods. HOWEVER: The performance loss by using locking is so tiny as to be virtually forgettable. For example, I wrote a simple test application that tried calling SetInstrumentInfo and GetInstrumentInfo 10 million times. The performance results were, as follows:

  • Without Locking, 10 million set and get calls: 2 seconds 149 ms
  • With locking, 10 million set and get calls: 2 seconds 195 ms

This translates into raw performance of roughly:

  • Without locking, you can execute 4653327 get & set calls per second.
  • With locking, you can execute 4555808 get & set calls per second.

From my perspective, the lock() statement is worth the 2.1% performance tradeoff. In my experience, get and set methods like these occupy only the tiniest fraction of the application's execution time. You'd probably be best off looking for potential optimizations elsewhere, or spending additional money to get a higher clock rate CPU.

like image 37
Ted Spence Avatar answered Oct 23 '22 20:10

Ted Spence