I'm currently looking at a copy-on-write set implementation and want to confirm it's thread safe. I'm fairly sure the only way it might not be is if the compiler is allowed to reorder statements within certain methods. For example, the <code>Remove</code> method looks like: <pre class="prettyprint"><code>public bool Remove(T item) { var newHashSet = new HashSet<T>(hashSet); var removed = newHashSet.Remove(item); hashSet = newHashSet; return removed; } </code></pre> Where hashSet is defined as <pre class="prettyprint"><code>private volatile HashSet<T> hashSet; </code></pre> So my question is, given that hashSet is <code>volatile</code> does it mean that the <code>Remove</code> on the new set happens before the write to the member variable? If not, then other threads may see the set before the remove has occurred. I haven't actually seen any issues with this in production, but I just want to confirm it is guaranteed to be safe. UPDATE To be more specific, there's another method to get an <code>IEnumerator</code>: <pre class="prettyprint"><code>public IEnumerator<T> GetEnumerator() { return hashSet.GetEnumerator(); } </code></pre> So the more specific question is: is there a guarantee that the returned <code>IEnumerator</code> will never throw a <code>ConcurrentModificationException</code> from a remove? UPDATE 2 Sorry, the answers are all addressing the thread safety from multiple writers. Good points are raised, but that's not what I'm trying to find out here. I'd like to know if the compiler is allowed to re-order the operations in <code>Remove</code> to something like this: <pre class="prettyprint"><code> var newHashSet = new HashSet<T>(hashSet); hashSet = newHashSet; // swapped var removed = newHashSet.Remove(item); // swapped return removed; </code></pre> If this was possible, it would mean that a thread could call <code>GetEnumerator</code> after <code>hashSet</code> had been assigned, but before <code>item</code> was removed, which could lead to the collection being modified during enumeration. Joe Duffy has a blog article that states: <blockquote> Volatile on loads means ACQUIRE, no more, no less. (There are additional compiler optimization restrictions, of course, like not allowing hoisting outside of loops, but let’s focus on the MM aspects for now.) The standard definition of ACQUIRE is that subsequent memory operations may not move before the ACQUIRE instruction; e.g. given { ld.acq X, ld Y }, the ld Y cannot occur before ld.acq X. However, previous memory operations can certainly move after it; e.g. given { ld X, ld.acq Y }, the ld.acq Y can indeed occur before the ld X. The only processor Microsoft .NET code currently runs on for which this actually occurs is IA64, but this is a notable area where CLR’s MM is weaker than most machines. Next, all stores on .NET are RELEASE (regardless of volatile, i.e. volatile is a no-op in terms of jitted code). The standard definition of RELEASE is that previous memory operations may not move after a RELEASE operation; e.g. given { st X, st.rel Y }, the st.rel Y cannot occur before st X. However, subsequent memory operations can indeed move before it; e.g. given { st.rel X, ld Y }, the ld Y can move before st.rel X. </blockquote> The way I read this is that the call to <code>newHashSet.Remove</code> requires a <code>ld newHashSet</code> and the write to <code>hashSet</code> requires a <code>st.rel newHashSet</code>. From the above definition of RELEASE no loads can move after the store RELEASE, so the statements cannot be reordered! Could someone confirm please confirm my interpretation is correct?

Consider using Interlocked.Exchange - it will guarantee ordering, or Interlocked.CompareExchange which as side benifit will let you detect (and potentially recover from) simultaneous writes to collection. Clearly it adds some additional level of synchronization so it is different from your current code, but easier to reason about. <pre class="prettyprint"><code>public bool Remove(T item) { var old = hashSet; var newHashSet = new HashSet<T>(old); var removed = newHashSet.Remove(item); var current = Interlocked.CompareExchange(ref hashSet, newHashSet, old); if (current != old) { // failed... throw or retry... } return removed; } </code></pre> And I think you'll still need <code>volatile hashSet</code> in this case.

EDITED: Thanks for clarifying the presence of an external lock for calls to Remove (and other collection mutations). Because of RELEASE semantics you will not end up storing a new value to <code>hashSet</code> until after the value to the variable <code>removed</code> has been assigned (because <code>st removed</code> can't be moved after <code>st.rel hashSet</code>). Therefore, the 'snapshot' behaviour of <code>GetEnumerator</code> will work as intended, at least with respect to Remove and other mutators implemented in a similar fashion.

Reordering of operations around volatile

Tags:

c#

volatile

lock-free

I'm currently looking at a copy-on-write set implementation and want to confirm it's thread safe. I'm fairly sure the only way it might not be is if the compiler is allowed to reorder statements within certain methods. For example, the Remove method looks like:

public bool Remove(T item)
{
    var newHashSet = new HashSet<T>(hashSet);
    var removed = newHashSet.Remove(item);
    hashSet = newHashSet;
    return removed;
}

Where hashSet is defined as

private volatile HashSet<T> hashSet;

So my question is, given that hashSet is volatile does it mean that the Remove on the new set happens before the write to the member variable? If not, then other threads may see the set before the remove has occurred.

I haven't actually seen any issues with this in production, but I just want to confirm it is guaranteed to be safe.

UPDATE

To be more specific, there's another method to get an IEnumerator:

public IEnumerator<T> GetEnumerator()
{
    return hashSet.GetEnumerator();
}

So the more specific question is: is there a guarantee that the returned IEnumerator will never throw a ConcurrentModificationException from a remove?

UPDATE 2

Sorry, the answers are all addressing the thread safety from multiple writers. Good points are raised, but that's not what I'm trying to find out here. I'd like to know if the compiler is allowed to re-order the operations in Remove to something like this:

    var newHashSet = new HashSet<T>(hashSet);
    hashSet = newHashSet;                  // swapped
    var removed = newHashSet.Remove(item); // swapped
    return removed;

If this was possible, it would mean that a thread could call GetEnumerator after hashSet had been assigned, but before item was removed, which could lead to the collection being modified during enumeration.

Joe Duffy has a blog article that states:

Volatile on loads means ACQUIRE, no more, no less. (There are additional compiler optimization restrictions, of course, like not allowing hoisting outside of loops, but let’s focus on the MM aspects for now.) The standard definition of ACQUIRE is that subsequent memory operations may not move before the ACQUIRE instruction; e.g. given { ld.acq X, ld Y }, the ld Y cannot occur before ld.acq X. However, previous memory operations can certainly move after it; e.g. given { ld X, ld.acq Y }, the ld.acq Y can indeed occur before the ld X. The only processor Microsoft .NET code currently runs on for which this actually occurs is IA64, but this is a notable area where CLR’s MM is weaker than most machines. Next, all stores on .NET are RELEASE (regardless of volatile, i.e. volatile is a no-op in terms of jitted code). The standard definition of RELEASE is that previous memory operations may not move after a RELEASE operation; e.g. given { st X, st.rel Y }, the st.rel Y cannot occur before st X. However, subsequent memory operations can indeed move before it; e.g. given { st.rel X, ld Y }, the ld Y can move before st.rel X.

The way I read this is that the call to newHashSet.Remove requires a ld newHashSet and the write to hashSet requires a st.rel newHashSet. From the above definition of RELEASE no loads can move after the store RELEASE, so the statements cannot be reordered! Could someone confirm please confirm my interpretation is correct?

449

asked Jul 06 '12 07:07

SimonC

2 Answers

Consider using Interlocked.Exchange - it will guarantee ordering, or Interlocked.CompareExchange which as side benifit will let you detect (and potentially recover from) simultaneous writes to collection. Clearly it adds some additional level of synchronization so it is different from your current code, but easier to reason about.

public bool Remove(T item) 
{ 
    var old = hashSet;
    var newHashSet = new HashSet<T>(old); 
    var removed = newHashSet.Remove(item); 
    var current = Interlocked.CompareExchange(ref hashSet, newHashSet, old);
    if (current != old)
    {
       // failed... throw or retry...
    }
    return removed; 
}

And I think you'll still need volatile hashSet in this case.

answered Oct 01 '22 01:10

Alexei Levenkov

EDITED:

Thanks for clarifying the presence of an external lock for calls to Remove (and other collection mutations).

Because of RELEASE semantics you will not end up storing a new value to hashSet until after the value to the variable removed has been assigned (because st removed can't be moved after st.rel hashSet).

Therefore, the 'snapshot' behaviour of GetEnumerator will work as intended, at least with respect to Remove and other mutators implemented in a similar fashion.

answered Oct 01 '22 03:10

Monroe Thomas

Related questions
                            
                                How do I listen for SQL Server database changes
                            
                                How to share cshtml pages across multiple projects?
                            
                                Ninject: Injecting two different objects of same type
                            
                                Are Oracle Synonyms Supported in Entity Framework?
                            
                                Multiple Joins in Entity Framework query
                            
                                Detect whether a Windows reboot was due to Windows updates
                            
                                Is there an open source MemoryStream implementation that does not use contiguous memory?
                            
                                Error in running WPF application
                            
                                How do I organize C# classes that inherit from one another, but also have properties that inherit from one another?
                            
                                Attempted to read or write protected memory. This is often an indication that other memory is corrupt
                            
                                c# Iterate array of parent type to invoke non-polymorphic method on derived type
                            
                                Hiding DateTimePicker's text and Usercontrol resize
                            
                                C# Static Analysis, possible values for a variable/parameter
                            
                                How to encapsulate a property in a base class?
                            
                                Managed C++ and AnyCPU
                            
                                Easiest way to read 2-channel samples into array from WaveStream
                            
                                pass a Type as an attribute argument
                            
                                How to Handle WebFaultException to return CustomException?
                            
                                Get assembly reference without reflection or a known type
                            
                                How to prevent Factory Method pattern causing warning about virtual member call in constructor?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With