My working assumption is that LINQ is thread-safe when used with the System.Collections.Concurrent collections (including ConcurrentDictionary).
(Other Overflow posts seem to agree: link)
However, an inspection of the implementation of the LINQ OrderBy extension method shows that it appears not to be threadsafe with the subset of concurrent collections which implement ICollection (e.g. ConcurrentDictionary).
The OrderedEnumerable GetEnumerator (source here) constructs an instance of a Buffer struct (source here) which tries to cast the collection to an ICollection (which ConcurrentDictionary implements) and then performs a collection.CopyTo with an array initialised to the size of the collection.
Therefore, if the ConcurrentDictionary (as the concrete ICollection in this case) grows in size during the OrderBy operation, between initialising the array and copying into it, this operation will throw.
The following test code shows this exception:
(Note: I appreciate that performing an OrderBy on a thread-safe collection which is changing underneath you is not that meaningful, but I do not believe it should throw)
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace Program
{
class Program
{
static void Main(string[] args)
{
try
{
int loop = 0;
while (true) //Run many loops until exception thrown
{
Console.WriteLine($"Loop: {++loop}");
_DoConcurrentDictionaryWork().Wait();
}
}
catch (Exception ex)
{
Console.WriteLine(ex);
}
}
private static async Task _DoConcurrentDictionaryWork()
{
var concurrentDictionary = new ConcurrentDictionary<int, object>();
var keyGenerator = new Random();
var tokenSource = new CancellationTokenSource();
var orderByTaskLoop = Task.Run(() =>
{
var token = tokenSource.Token;
while (token.IsCancellationRequested == false)
{
//Keep ordering concurrent dictionary on a loop
var orderedPairs = concurrentDictionary.OrderBy(x => x.Key).ToArray(); //THROWS EXCEPTION HERE
//...do some more work with ordered snapshot...
}
});
var updateDictTaskLoop = Task.Run(() =>
{
var token = tokenSource.Token;
while (token.IsCancellationRequested == false)
{
//keep mutating dictionary on a loop
var key = keyGenerator.Next(0, 1000);
concurrentDictionary[key] = new object();
}
});
//Wait for 1 second
await Task.Delay(TimeSpan.FromSeconds(1));
//Cancel and dispose token
tokenSource.Cancel();
tokenSource.Dispose();
//Wait for orderBy and update loops to finish (now token cancelled)
await Task.WhenAll(orderByTaskLoop, updateDictTaskLoop);
}
}
}
That the OrderBy throws an exception leads to one of a few possible conclusions:
1) My assumption about LINQ being threadsafe with concurrent collections is incorrect, and it is only safe to perform LINQ on collections (be they concurrent or not) which are not mutating during the LINQ query
2) There is a bug with the implementation of LINQ OrderBy and it is incorrect for the implementation to try and cast the source collection to an ICollection and try and perform the collection copy (and It should just drop through to its default behaviour iterating the IEnumerable).
3) I have misunderstood what is going on here...
Thoughts much appreciated!
It's not stated anywhere that OrderBy
(or other LINQ methods) should always use GetEnumerator
of source IEnumerable
or that it should be thread safe on concurrent collections. All that is promised is this method
Sorts the elements of a sequence in ascending order according to a key.
ConcurrentDictionary
is not thread-safe in some global sense either. It's thread-safe with respect to other operations performed on it. Even more, documentation says that
All public and protected members of ConcurrentDictionary are thread-safe and may be used concurrently from multiple threads. However, members accessed through one of the interfaces the ConcurrentDictionary implements, including extension methods, are not guaranteed to be thread safe and may need to be synchronized by the caller.
So, your understanding is correct (OrderBy
will see IEnumerable
you pass to it is really ICollection
, will then get length of that collection, allocate buffer of that size, then will call ICollection.CopyTo
, and this is of course not thread safe on any type of collection), but it's not a bug in OrderBy
because neither OrderBy
nor ConcurrentDictionary
ever promised what you assume.
If you want to do OrderBy
in a thread safe way on ConcurrentDictionary
, you need to rely on methods that are promised to be thread safe. For example:
// note: this is NOT IEnumerable.ToArray()
// but public ToArray() method of ConcurrentDictionary itself
// it is guaranteed to be thread safe with respect to other operations
// on this dictionary
var snapshot = concurrentDictionary.ToArray();
// we are working on snapshot so no one other thread can modify it
// of course at this point real contents of dictionary might not be
// the same as our snapshot
var sorted = snapshot.OrderBy(c => c.Key);
If you don't want to allocate additional array (with ToArray
), you can use Select(c => c)
and it will work in this case, but then we are again in moot territory and relying on something to be safe to use in situation it was not promised to (Select
will also not always enumerate your collection. If collection is array or list - it will shortcut and use indexers instead). So you can create extension method like this:
public static class Extensions {
public static IEnumerable<T> ForceEnumerate<T>(this ICollection<T> collection) {
foreach (var item in collection)
yield return item;
}
}
And use it like this if you want to be safe and don't want to allocate array:
concurrentDictionary.ForceEnumerate().OrderBy(c => c.Key).ToArray();
In this case we are forcing enumeration of ConcurrentDictionary
(which we know is safe from documentation) and then pass that to OrderBy
knowing that it cannot do any harm with that pure IEnumerable
. Note that as correctly pointed out in comments by mjwills, this is not exactly the same as ToArray
, because ToArray
produces snapshot (locks collection preventing modifications while building array) and Select
\ yield
does not acquire any locks (so items might be added\removed right when enumeration is in progress). Though I doubt it matters when doing things like described in question - in both cases after OrderBy
is completed - you have no idea whether your ordered results reflect current state of collection or not.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With