Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove sublist from a list

Tags:

c#

.net

I have 2 lists: list1 and list2 (both of type int)

Now I want to remove content of list2 from list1. How I can do this in C#?

PS: Don't use loop.

like image 843
Arry Avatar asked Jan 17 '13 21:01

Arry


2 Answers

IMPORTANT CHANGE

As was pointed out in the comments, .Except() uses a set internally, so any duplicate members of list1 will be absent in the final result.

Produces the set difference of two sequences

http://msdn.microsoft.com/en-us/library/system.linq.enumerable.except(v=vs.110).aspx

However, there is a solution that is both O(N) and preserves duplicates in the original list: Modify the RemoveAll(i => list2.Contains(i)) approach to use a HashSet<int> to hold the exclusion set.

List<int> list1 = Enumerable.Range(1, 10000000).ToList();
HashSet<int> exclusionSet = Enumerable.Range(500000, 10).ToHashSet(); 

list1.Remove(i => exclusionSet.Contains(i));

The extension method ToHashSet() is available in MoreLinq.

ORIGINAL ANSWER

You can use Linq

list1 = list1.Except(list2).ToList();

UPDATE

Out of curiosity I did a simple benchmark of my solution vs. @HighCore's.

For list2 having just one element, his code is faster. As list2 gets larger and larger, his code gets extremely slow. It looks like his is O(N-squared) (or more specifically O(list1.length*list2.length) since each item in list1 is compared to each item in list2). Don't have enough data points to check the Big-O of my solution, but it is much faster when list2 has more than a handful of elements.

Code used to test:

        List<int> list1 = Enumerable.Range(1, 10000000).ToList();
        List<int> list2 = Enumerable.Range(500000, 10).ToList(); // Gets MUCH slower as 10 increases to 100 or 1000

        Stopwatch sw = Stopwatch.StartNew();

        //list1 = list1.Except(list2).ToList();
        list1.RemoveAll(i => list2.Contains(i));

        sw.Stop();

        var ms1 = sw.ElapsedMilliseconds;

UPDATE 2

This solution assigns a new list to the variable list1. As @Толя points out, other references (if any) to the original list1 will not be updated. This solution drastically outperforms RemoveAll for all but the smallest sizes of list2. If no other references must see the update, it is preferable for that reason.

like image 158
Eric J. Avatar answered Sep 30 '22 12:09

Eric J.


list1.RemoveAll(x => list2.Contains(x));
like image 33
Federico Berasategui Avatar answered Sep 30 '22 13:09

Federico Berasategui