Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the RemoveRange() method work in a List<>?

As in title. I know it probably merges 2 sublists before and after deleted items, but how does that method behave when removing LAST elements? In other words: does it somehow make a copy of all elements located before remove index? I'm just curious about perfomance of using RemoveRange on a giant List (let's say 5000 elements) just to remove f.e. only last 2 of them.

If it makes a copy then, is there a way to change some internal variable that sets size of the List (and treat rest of allocated elements as garbage)?

I only managed to find an info, that it's an O(n) complexity algorithm, but I'm not sure if the "n" by that case is a List size, or a number of items to delete.

Will be glad for any hint.

like image 412
Tarec Avatar asked Apr 17 '13 16:04

Tarec


2 Answers

What it will do is take each item after the end of the range of items to remove and move it up in the list by the number of items that were removed. In terms of performance implications there will be one copy for each item after the end of the range of items moved. This means that it will perform best when removing from the end (it'll be O(1)) and perform worst when removing from the start (O(n)).

As an example, consider the following list:

index - value

0 - A
1 - B
2 - C
3 - D
4 - E
5 - F
6 - G

If we call RemoveRange(2, 2) Then we're removing two items starting at index 2, so we're removing C and D.

This means E needs to be copied to index 2, then F needs to be copied to index 3, and G needs to be copied to index 4. There is one copy operation for each item after the last item removed.

Note that because of the fact that you can move the entire block of memory "up" by two this ends up being more efficient in practice that copying each item individually. It's a lot easier for a computers memory to move an entire block of memory up by some fixed number of bytes than to move lots of little sections of memory to different arbitrary locations. It will have the same asymptotic complexity though.

like image 146
Servy Avatar answered Oct 22 '22 12:10

Servy


List<T> is just a wrapper around an array, called _items in the code below. The array might have more slots in it than there are items in the list, so _size is used to keep track of how many are actually valid. When you do a RemoveRange(index, count)...

_size -= count;
if (index < _size)
    Array.Copy(_items, index + count, _items, index, _size - index);
Array.Clear(_items, _size, count);

...it copies the items from past the range into the now-empty space, and clears the old items.

If you are removing a range close to the beginning of a large list then the cost is proportional to the size of the list, because so much stuff has to be moved down.

If you are removing a large range close to the end of a large list then the copying is cheap but the clearing is expensive, so the cost will be proportional to the size of the range removed.

like image 36
Eric Lippert Avatar answered Oct 22 '22 13:10

Eric Lippert