Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

List<object>.RemoveAll - How to create an appropriate Predicate

The RemoveAll() methods accept a Predicate<T> delegate (until here nothing new). A predicate points to a method that simply returns true or false. Of course, the RemoveAll will remove from the collection all the T instances that return True with the predicate applied.

C# 3.0 lets the developer use several methods to pass a predicate to the RemoveAll method (and not only this one…). You can use:

Lambda expressions

vehicles.RemoveAll(vehicle => vehicle.EnquiryID == 123);

Anonymous methods

vehicles.RemoveAll(delegate(Vehicle v) {
  return v.EnquiryID == 123;
});

Normal methods

vehicles.RemoveAll(VehicleCustomPredicate);
private static bool
VehicleCustomPredicate (Vehicle v) {
    return v.EnquiryID == 123; 
}

A predicate in T is a delegate that takes in a T and returns a bool. List<T>.RemoveAll will remove all elements in a list where calling the predicate returns true. The easiest way to supply a simple predicate is usually a lambda expression, but you can also use anonymous methods or actual methods.

{
    List<Vehicle> vehicles;
    // Using a lambda
    vehicles.RemoveAll(vehicle => vehicle.EnquiryID == 123);
    // Using an equivalent anonymous method
    vehicles.RemoveAll(delegate(Vehicle vehicle)
    {
        return vehicle.EnquiryID == 123;
    });
    // Using an equivalent actual method
    vehicles.RemoveAll(VehiclePredicate);
}

private static bool VehiclePredicate(Vehicle vehicle)
{
    return vehicle.EnquiryID == 123;
}

This should work (where enquiryId is the id you need to match against):

vehicles.RemoveAll(vehicle => vehicle.EnquiryID == enquiryId);

What this does is passes each vehicle in the list into the lambda predicate, evaluating the predicate. If the predicate returns true (ie. vehicle.EnquiryID == enquiryId), then the current vehicle will be removed from the list.

If you know the types of the objects in your collections, then using the generic collections is a better approach. It avoids casting when retrieving objects from the collections, but can also avoid boxing if the items in the collection are value types (which can cause performance issues).


Little bit off topic but say i want to remove all 2s from a list. Here's a very elegant way to do that.

void RemoveAll<T>(T item,List<T> list)
{
    while(list.Contains(item)) list.Remove(item);
}

With predicate:

void RemoveAll<T>(Func<T,bool> predicate,List<T> list)
{
    while(list.Any(predicate)) list.Remove(list.First(predicate));
}

+1 only to encourage you to leave your answer here for learning purposes. You're also right about it being off-topic, but I won't ding you for that because of there is significant value in leaving your examples here, again, strictly for learning purposes. I'm posting this response as an edit because posting it as a series of comments would be unruly.

Though your examples are short & compact, neither is elegant in terms of efficiency; the first is bad at O(n2), the second, absolutely abysmal at O(n3). Algorithmic efficiency of O(n2) is bad and should be avoided whenever possible, especially in general-purpose code; efficiency of O(n3) is horrible and should be avoided in all cases except when you know n will always be very small. Some might fling out their "premature optimization is the root of all evil" battle axes, but they do so naïvely because they do not truly understand the consequences of quadratic growth since they've never coded algorithms that have to process large datasets. As a result, their small-dataset-handling algorithms just run generally slower than they could, and they have no idea that they could run faster. The difference between an efficient algorithm and an inefficient algorithm is often subtle, but the performance difference can be dramatic. The key to understanding the performance of your algorithm is to understand the performance characteristics of the primitives you choose to use.

In your first example, list.Contains() and Remove() are both O(n), so a while() loop with one in the predicate & the other in the body is O(n2); well, technically O(m*n), but it approaches O(n2) as the number of elements being removed (m) approaches the length of the list (n).

Your second example is even worse: O(n3), because for every time you call Remove(), you also call First(predicate), which is also O(n). Think about it: Any(predicate) loops over the list looking for any element for which predicate() returns true. Once it finds the first such element, it returns true. In the body of the while() loop, you then call list.First(predicate) which loops over the list a second time looking for the same element that had already been found by list.Any(predicate). Once First() has found it, it returns that element which is passed to list.Remove(), which loops over the list a third time to yet once again find that same element that was previously found by Any() and First(), in order to finally remove it. Once removed, the whole process starts over at the beginning with a slightly shorter list, doing all the looping over and over and over again starting at the beginning every time until finally no more elements matching the predicate remain. So the performance of your second example is O(m*m*n), or O(n3) as m approaches n.

Your best bet for removing all items from a list that match some predicate is to use the generic list's own List<T>.RemoveAll(predicate) method, which is O(n) as long as your predicate is O(1). A for() loop technique that passes over the list only once, calling list.RemoveAt() for each element to be removed, may seem to be O(n) since it appears to pass over the loop only once. Such a solution is more efficient than your first example, but only by a constant factor, which in terms of algorithmic efficiency is negligible. Even a for() loop implementation is O(m*n) since each call to Remove() is O(n). Since the for() loop itself is O(n), and it calls Remove() m times, the for() loop's growth is O(n2) as m approaches n.