Mistake in List.InsertRange(), or design decision?

Question

I was reading the code for List<T>.InsertRange() in the .NET 4.0 framework, when I noticed a strange peculiarity. Here is the code for reference:

    public void InsertRange(int index, IEnumerable<T> collection) {
        if (collection==null) { 
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
        } 

        if ((uint)index > (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_Index); 
        }
        Contract.EndContractBlock();

        ICollection<T> c = collection as ICollection<T>; 
        if( c != null ) {    // if collection is ICollection<T>
            int count = c.Count; 
            if (count > 0) { 
                EnsureCapacity(_size + count);
                if (index < _size) { 
                    Array.Copy(_items, index, _items, index + count, _size - index);
                }

                // If we're inserting a List into itself, we want to be able to deal with that. 
                if (this == c) {
                    // Copy first part of _items to insert location 
                    Array.Copy(_items, 0, _items, index, index); 
                    // Copy last part of _items back to inserted location
                    Array.Copy(_items, index+count, _items, index*2, _size-index); 
                }
                else {
                    T[] itemsToInsert = new T[count];
                    c.CopyTo(itemsToInsert, 0); 
                    itemsToInsert.CopyTo(_items, index);
                } 
                _size += count; 
            }
        } 
        else {
            using(IEnumerator<T> en = collection.GetEnumerator()) {
                while(en.MoveNext()) {
                    Insert(index++, en.Current); 
                }
            } 
        } 
        _version++;
    }

In particular, notice that _version is always incremented at the end of the function. This means that in-progress enumerations over the list will be invalidated at any non-exceptional call to InsertRange, even if the List was not changed. For instance, the following code throws:

static void Main(string [] args) {
    var list = new List<object>() {1, 2 };


    using(var enumerator = list.GetEnumerator()) {
    if(enumerator.MoveNext())
        Console.WriteLine(enumerator.Current);

    list.InsertRange(1, new object[]{});


    if(enumerator.MoveNext()) // ** InvalidOperationException
        Console.WriteLine(enumerator.Current);
    }
}

Modifying the method so that enumeration is not invalidated in this way would not increase execution time at all because the code already checks the size of count. It could be rewritten as follows:

public void InsertRange(int index, IEnumerable<T> collection) {
    if (collection==null) { 
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
    } 

    if ((uint)index > (uint)_size) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_Index); 
    }
    Contract.EndContractBlock();

    ICollection<T> c = collection as ICollection<T>; 
    if( c != null ) {    // if collection is ICollection<T>
        int count = c.Count; 
        if (count > 0) { 
            EnsureCapacity(_size + count);
            if (index < _size) { 
                Array.Copy(_items, index, _items, index + count, _size - index);
            }

            // If we're inserting a List into itself, we want to be able to deal with that. 
            if (this == c) {
                // Copy first part of _items to insert location 
                Array.Copy(_items, 0, _items, index, index); 
                // Copy last part of _items back to inserted location
                Array.Copy(_items, index+count, _items, index*2, _size-index); 
            }
            else {
                T[] itemsToInsert = new T[count];
                c.CopyTo(itemsToInsert, 0); 
                itemsToInsert.CopyTo(_items, index);
            } 
            _size += count;
            _version++;
        }
    } 
    else {
        var inserted = false;

        using(IEnumerator<T> en = collection.GetEnumerator()) {
            while(en.MoveNext()) {
                inserted = true;
                Insert(index++, en.Current); 
            }  
        }

        if (inserted) _version++; 
    } 
}

Which has as its only disadvantage an extra local variable (which will probably be JITed into a register), maybe 20 bytes increase in the working set, and an irrelevant amount of extra CPU work when inserting IEnumerables. If the extra bool or the in-loop assignment need to be avoided, the insertion for IEnumerables could be performed as

if(en.MoveNext()) {
    Insert(index++, en.Current); 
    _version++;
}

while(en.MoveNext()) {
    Insert(index++, en.Current); 
}

So...

Is the .NET implementation the intended behavior, or is it a mistake?

EDIT:

I realize that if you are enumeration on one thread while modifying the thread on another, you are doing something wrong. According to the documentation, the behavior in these cases is undefined. However, List<T> does the programmer a favor and throws an exception in these cases. I am not asking if List<T> follows the documentation correctly: it does. I am asking if it is implemented in a way that is not what was intended by Microsoft.

If InsertRange() is behaving as intended, then List<T> has inconsistent behavior. The RemoveRange() method only invalidates enumeration if items were actually removed:

static void Main(string [] args) {
    var list = new List<object>() {1, 2 };

    using(var enumerator = list.GetEnumerator()) {
    if(enumerator.MoveNext())
        Console.WriteLine(enumerator.Current);

    list.RemoveRange(1, 0);


    if(enumerator.MoveNext()) // ** Does not throw
        Console.WriteLine(enumerator.Current);
    }
}

Paul Phillips · Accepted Answer

I would guess that this is intentional. C# follows a "pit of success" design - they want it to be hard to make a mistake.

Ultimately, the existing design makes it easier to analyze the use of that method. What do I mean by that? Well:

The example you cite is trivial, at a glance you can tell it isn't really modifying the list. But in almost all real-world code that isn't the case. The sequence being inserted would almost certainly be dynamically created, and could be an empty sequence almost randomly. Should an empty sequence really behave differently? Your code would work if the sequence inserted was empty, but the moment you put something real in there, ka-boom.

Imagine if you first write this code and all your sequences are empty; looks like it works. Then, some arbitrary time later, you have a non-empty insert. Now you get exceptions. The distance between you inserting the problem and detecting it is potentially very large.

With it throwing on any successful call, that failure mode becomes easier to detect.

Mistake in List<T>.InsertRange(), or design decision?

Tags:

c#

.net

list

frameworks

implementation

Michael Graczyk

1 Answers

Paul Phillips

Recent Activity

Donate For Us