Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

.ToList() thread-safe

I've have the following situation:

  • There is only one thead which only adds items to a list
  • Multiple threads which can read the items.

I know when we directly expose the list, I will ran into concurrent modification while enumerating.

But when I call .ToList() it makes a instance of the List and copies all items with the Array.Copy function of the .NET Framework.

What do you mean, is it safe when using this approach when dirty reads are ok?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var container = new Container();
            Thread writer = new Thread(() =>
            {
                for (int i = 0; i < 1000; i++)
                {
                    container.Add(new Item("Item " + i));
                }
            });

            writer.Start();

            Thread[] readerThreads = new Thread[10];
            for (int i = 0; i < readerThreads.Length; i++)
            {
                readerThreads[i] = new Thread(() =>
                {
                    foreach (Item item in container.Items)
                    {
                        Console.WriteLine(item.Name);
                    }
                });
                readerThreads[i].Start();
            }

            writer.Join();

            for (int i = 0; i < readerThreads.Length; i++)
            {
                readerThreads[i].Join();
            }

            Console.ReadLine();
        }
    }

    public class Container
    {
        private readonly IList<Item> _items;

        public Container()
        {
            _items = new List<Item>();
        }

        public IReadOnlyList<Item> Items
        {
            get { return _items.ToList().AsReadOnly(); }
        }

        public void Add(Item item)
        {
            _items.Add(item);
        }
    }

    public class Item
    {
        private readonly string _name;

        public Item(string name)
        {
            _name = name;
        }

        public string Name
        {
            get { return _name; }
        }
    }
}
like image 295
crip Avatar asked Sep 16 '15 17:09

crip


People also ask

Is ToList thread-safe?

The toList() extension function is not thread-safe, so it may fail if the collection is modified in the background thread.

What does ToList () do?

The tolist() function is used to convert a given array to an ordinary list with the same items, elements, or values.

Is ToList a deep copy?

ToList() makes a shallow copy. The references are copied, but the new references still point to the same instances as the original references point to.

Is .NET list thread-safe?

NET Framework 4 introduces the System. Collections. Concurrent namespace, which includes several collection classes that are both thread-safe and scalable. Multiple threads can safely and efficiently add or remove items from these collections, without requiring additional synchronization in user code.


2 Answers

is it safe when using this approach when dirty reads are ok?

Easy answer - no, it's not.

First, the method does not guarantee that by definition. Even if you look at the current implementation (which you should not do), you'll see that it's using List<T> constructor accepting the IEnumerable<T>. Then it checks for ICollection<T> and if yes, uses CopyTo method or just iterates the enumerable. Both are unsafe, because the first one relies on the (unknown) implementation of CopyTo method while the second will receive exception when (the standard behavior) the collection enumerator is invalidated during the Add. And even if the source is a List<T> and is using the Array method you mentioned http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,d2ac2c19c9cf1d44

Array.Copy(_items, 0, array, arrayIndex, _size);

still is unsafe because it may call copy with _items and _size beyond the _items.Length which of course will throw exception. Your assumption would have been correct only if this code is somehow getting both members atomically, but it doesn't.

So, simple don't do that.

EDIT: All above applies to your concrete question. But I think there is a simple solution for the situation you have explained. If single thread add/multiple thread read with acceptable stale reads is the requirement, then it could be achieved by using the ImmutableList<T> class from the System.Collections.Immutable package like this:

public class Container
{
    private ImmutableList<Item> _items = ImmutableList<Item>.Empty;
    public IReadOnlyList<Item> Items { get { return _items; } }
    public void Add(Item item) { _items = _items.Add(item); }
}
like image 179
Ivan Stoev Avatar answered Oct 10 '22 10:10

Ivan Stoev


What I would do is ditch your current implementation and switch to a BlockingCollection

static void Main(string[] args)
{
    var container = new BlockingCollection<Item>();
    Thread writer = new Thread(() =>
    {
        for (int i = 0; i < 1000; i++)
        {
            container.Add(new Item("Item " + i));
        }
        container.CompleteAdding();
    });

    writer.Start();

    Thread[] readerThreads = new Thread[10];
    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i] = new Thread(() =>
        {
            foreach (Item item in container.GetConsumingEnumerable())
            {
                Console.WriteLine(item.Name);
            }
        });
        readerThreads[i].Start();
    }

    writer.Join();

    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i].Join();
    }

    Console.ReadLine();
}

What a BlockingCollection will do is give you a thread safe queue (By default it uses a ConcurrentQueue as the backing storage, you can change the behavior to a stack by passing a ConcurrentStack to the constructor) that will block the foreach loops on the readers when the queue is empty waiting for more data to show up. Once the producer calls container.CompleteAdding() that lets all of the readers stop blocking and exit their foreach loops.

This does change your program's behavior, your old way would have given the same Item to multiple threads, this new code will only give the item inserted to a single "worker" thread. This implementation will also give new items to workers that where inserted in to the list after the worker entered its foreach loop, your old implementation uses a "Snapshot" of the list and does its work on that snapshot. I am willing to bet this is the behavior you actually wanted but you did not realize your old way would have duplicated the processing.

If you did want to preserve the old behavior of multiple threads all using "snapshot in time" lists (multiple threads process the same item, new items added to the list after the snapshot are not processed) switch to using the ConcurrentQueue directly instead of the BlockingCollection, it's GetEnumerator() creates a "moment in time" snapshot of the queue when you create the enumerator and will use that list for the foreach.

static void Main(string[] args)
{
    var container = new ConcurrentQueue<Item>();
    Thread writer = new Thread(() =>
    {
        for (int i = 0; i < 1000; i++)
        {
            container.Enqueue(new Item("Item " + i));
        }
    });

    writer.Start();

    Thread[] readerThreads = new Thread[10];
    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i] = new Thread(() =>
        {
            foreach (Item item in container)
            {
                Console.WriteLine(item.Name);
            }
        });
        readerThreads[i].Start();
    }

    writer.Join();

    for (int i = 0; i < readerThreads.Length; i++)
    {
        readerThreads[i].Join();
    }

    Console.ReadLine();
}
like image 25
Scott Chamberlain Avatar answered Oct 10 '22 10:10

Scott Chamberlain