
Best way to compare two large string lists, using C# and LINQ?

Tags: c#, linq

I have a large list (~110,000 strings) that I need to compare to a similarly sized list.

List A comes from one system. List B comes from a SQL table (I have read-only access: no stored procedures, etc.).

What is the best way to find which values are in list A but no longer exist in list B?

Is 100,000 strings a large number to handle in an array?

Thanks.

Donaldinio asked Jan 13 '10 20:01


4 Answers

So you have two lists like so:

List<string> listA;
List<string> listB;

Then use Enumerable.Except:

List<string> except = listA.Except(listB).ToList();

Note that if you want to, say, ignore case:

List<string> except = listA.Except(listB, StringComparer.OrdinalIgnoreCase).ToList();

You can replace the last parameter with an IEqualityComparer<string> of your choosing.
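
For illustration, here is a minimal self-contained sketch of the two calls above. The sample data, the class name ExceptDemo, and the helper MissingFrom are made up for the example and are not part of the original answer. One thing worth knowing: Except is a set difference, so duplicate values in listA appear only once in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper: returns the distinct values of listA
    // that do not appear in listB under the given comparer.
    public static List<string> MissingFrom(
        List<string> listA,
        List<string> listB,
        IEqualityComparer<string> comparer)
    {
        return listA.Except(listB, comparer).ToList();
    }

    public static void Main()
    {
        // Invented sample data; note the duplicate "beta" in listA.
        var listA = new List<string> { "alpha", "beta", "gamma", "beta" };
        var listB = new List<string> { "ALPHA", "gamma" };

        // Case-sensitive: "alpha" and "ALPHA" are different strings,
        // so "alpha" counts as missing from listB.
        List<string> exact = MissingFrom(listA, listB, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", exact));

        // Case-insensitive: "alpha" matches "ALPHA", so only "beta" remains.
        List<string> ignoreCase = MissingFrom(listA, listB, StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(string.Join(", ", ignoreCase));
    }
}
```

If the number of occurrences matters (for example, a value appearing three times in A and once in B), Except is probably not the right tool, since it discards duplicates.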

jason answered Oct 25 '22 11:10


With LINQ:

var missing = listA.Except(listB).ToList();

Marc Gravell answered Oct 25 '22 10:10


Out of interest, do you HAVE to use List<string>? In .NET 3.5 SP1 you can use HashSet<T> and its ExceptWith method. As I understand it, HashSet is specifically optimized for set operations such as comparing two sets.
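
A minimal sketch of that approach, assuming the values from both sources can be loaded into memory. The sample data, the class name HashSetDemo, and the helper MissingFrom are invented for illustration. ExceptWith modifies the set in place, so the sketch builds a new set from list A rather than mutating the caller's collection.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Hypothetical helper: copies listA into a set, then removes
    // every value that also occurs in listB.
    public static HashSet<string> MissingFrom(
        IEnumerable<string> listA,
        IEnumerable<string> listB)
    {
        var missing = new HashSet<string>(listA, StringComparer.Ordinal);
        missing.ExceptWith(listB); // in-place set difference
        return missing;
    }

    public static void Main()
    {
        // Invented sample data.
        var listA = new List<string> { "alpha", "beta", "gamma" };
        var listB = new List<string> { "alpha", "gamma", "delta" };

        HashSet<string> missing = MissingFrom(listA, listB);

        // A HashSet does not guarantee enumeration order,
        // so sort before printing if a stable order is needed.
        var sorted = new List<string>(missing);
        sorted.Sort(StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", sorted));
    }
}
```

In practice the difference from Enumerable.Except is likely to be small, because Except also relies on hashing internally; the HashSet version is mainly useful when you want to keep the set around for further lookups.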

Michael Stum answered Oct 25 '22 12:10


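List<string> A = new List<string>(); // fill with the values read from the file
List<string> B = new List<string>(); // fill with the values read from the db

// Except is lazily evaluated: C is computed when it is enumerated.
var C = A.Except(B);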

Paul Creasey answered Oct 25 '22 12:10