Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is T[] not better than IEnumerable<T> as parameter type? (Considering threading challenges)

Some programmers argue that it is better to pass IEnumerable<T> parameters over passing implementations like List<T>, and one of the popular reasons to do this is that the API is immediately usable in more scenarios because more collections will be compatible with IEnumerable<T> than any other specific implementation, e.g. List<T>.

The more I dive into multi-threaded development scenarios, the more I am starting to think that IEnumerable<T> is not the correct type either and I will try to explain why below.

Have you ever received the following exception while enumerating a collection?

Collection was modified; enumeration operation may not execute. (an InvalidOperationException)

Collection was modified exception

Basically what causes this is, you are given the collection on one thread, while someone else modifies the collection on another thread.

To circumvent the problem, I then developed a habit to take a snapshot of the collection before I enumerate it, by converting the collection to an array inside the method, like this:

static void LookAtCollection(IEnumerable<int> collection)
{   
    foreach (int Value in collection.ToArray() /* take a snapshot by converting to an array */)
    {
        Thread.Sleep(ITEM_DELAY_MS);
    }
}

My question is this. Wouldn't it be better to code towards arrays instead of enumerables as a general rule, even if it means that callers are now forced to convert their collections to arrays when using your API?

Is this not cleaner and more bullet-proof? (the parameter type is an array instead)

static void LookAtCollection(int[] collection)
{   
    foreach (int Value in collection)
    {
        Thread.Sleep(ITEM_DELAY_MS);
    }
}

The array meets all the requirements of being enumerable and of a fixed length, and the caller is aware of the fact that you will be operating on a snapshot of the collection at that point in time, which can save more bugs.

The only better alternative I can find right now is the IReadOnlyCollection which will then be even more bullet proof because the collection is then also readonly in terms of item-content.

EDIT:

@DanielPryden provided a link to a very nice article "Arrays considered somewhat harmful". And the comments made by the writer "I rarely need a collection which has the rather contradictory properties of being completely mutable, and at the same time, fixed in size" and "In almost every case, there is a better tool to use than an array." kind of convinced me that arrays are not as close to the silver bullet as I had hoped for, and I agree with the risks and loopholes. I think now the IReadOnlyCollection<T> interface is a better alternative than both the array and the IEnumerable<T>, but it kind of leaves me with the question now: Does the callee enforce it by having a IReadOnlyCollection<T> parameter type in the method declaration? Or should the caller still be allowed to decide what implementation of IEnumerable<T> he passes into the method that looks at the collection?

Thanks for all the answers. I learned a lot from these responses.

like image 796
Martin Lottering Avatar asked Feb 20 '13 12:02

Martin Lottering


2 Answers

The real "problem" here is that while T[] is probably a more specific parameter type than would be ideal, and allows the recipient free access to write any element (which may or may not be desirable), IEnumerable<T> is too general. From a type-safety standpoint, the caller can supply a reference to any object which implements IEnumerable<T>, but in reality only certain such objects will actually work and there's no clean way for the caller to know which ones those are.

If T is a reference type, a T[] will be inherently thread-safe for reading, writing, and enumeration, subject to certain conditions (e.g. threads accessing different elements will not interfere at all; if one thread writes an item around the time another thread reads or enumerates it, the latter thread will see either old or new data. None of Microsoft's collection interfaces offer any such guarantees, nor do they provide any means by which a collection can indicate what guarantees it can or cannot make.

My inclination would be to either use T[], or else define an IThreadSafeList<T> where T:class which would include members like CompareExchangeItem that would allow Interlocked.CompareExchange on an item, but would not include things like Insert and Remove which cannot be done in thread-safe fashion.

like image 175
supercat Avatar answered Nov 16 '22 02:11

supercat


Most methods are not thread-safe unless explicitly documented otherwise.

In this case I'd say it's up to the caller to take care of thread-safety. If the caller passes an IEnumerable<T> parameter to a method, he should hardly be surprised if the method enumerates it! If the enumeration is not thread-safe, the caller needs to create a snapshot in a thread-safe manner, and pass in the snapshot.

The callee can't do this by calling ToArray() or ToList() - these methods will also enumerate the collection, and the callee can't know what is required (e.g. a lock) to make a snapshot in a thread-safe way.

like image 23
Joe Avatar answered Nov 16 '22 02:11

Joe