The following code:
List<Interval> intervals = new List<Interval>();
List<int> points = new List<int>();
//Initialization of the two lists
// [...]
foreach (var point in points)
{
intervals.RemoveAll (x => x.Intersects (point));
}
is at least 100x faster than this when the lists are of size ~10000:
List<Interval> intervals = new List<Interval>();
List<int> points = new List<int>();
//Initialization of the two lists
// [...]
foreach (var point in points)
{
for (int i = 0; i < intervals.Count;)
{
if (intervals[i].Intersects(point))
{
intervals.Remove(intervals[i]);
}
else
{
i++;
}
}
}
How is it possible? What is performed under the hood with "RemoveAll"? According to MSDN, "RemoveAll" performs a linear search and is therefore in O(n). So I would expect similar performance for both.
When replacing "Remove" by "RemoveAt", the iteration is much faster, comparable to "RemoveAll". But both "Remove" and "RemoveAt" have O(n) complexity, so why is the performance difference between them so big? Could it only be due to the fact that "Remove (item)" compares the list elements with "item" and "RemoveAt" doesn't perform any comparison?
If you remove an item from a List<T>
, all the items after it will be moved back one spot. So if you remove n items, a lot of items will be moved n times.
RemoveAll will only do the moving once, which you can see in the source for List<T>
: source
Another thing is that Remove(T item)
will search the entire List for the item, so that's another n operations.
Something that has nothing to do with your question, but I'd like to point out anyway:
If you use a for-loop to delete items from a List, it's a lot easier to start at the end:
for (int i = intervals.Count - 1; i >= 0; i--)
{
if (intervals[i].Intersects(point))
{
intervals.RemoveAt(i);
}
}
This way, you don't need that ugly else-clause
RemoveAll
can be done in O(n)
by checking the condition for n
elements and moving at most n
elements.
Your loop is O(n^2)
, as each Remove
needs to check up to n
elements. And even if you change it to RemoveAt
, it still needs to move up to n
elements.
This might be the fastest solution:
intervals.RemoveAll(x => points.Any(x.Intersects));
List
is an array, and removing one element from an array requires moving all elements after the one you're removing to the previous index, so a[i]
is moved to a[i-1]
.
Doing this repeatedly requires multiple moves, even if more elements meet the removal criteria. RemoveAll
may optimize this by moving the elements by more than 1 index at a time as it traverses the list and finds more elements that match the removal criteria.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With