Checking a list with null values for duplicates in C#

Question

In C#, I can use something like:

List<string> myList = new List<string>();

if (myList.Count != myList.Distinct().Count())
{
    // there are duplicates
}

to check for duplicate elements in a list. However, when the list contains more than one null item, this produces a false positive. I can work around this with some sluggish code, but is there a concise way to check for duplicates in a list while disregarding null values?
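For illustration, here is a minimal sketch of the false positive described in the question. The sample values are made up; the null-filtering variant at the end is just one possible workaround and not necessarily the "sluggish code" the question refers to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data (assumed for illustration): no repeated strings, but two nulls.
var myList = new List<string> { "a", null, "b", null };

// Distinct() collapses the two nulls into one, so the counts are 4 and 3
// and the check reports duplicates even though no string repeats.
bool naive = myList.Count != myList.Distinct().Count();

// Dropping nulls first removes the false positive, at the cost of
// materializing a filtered copy and then scanning it again.
var nonNull = myList.Where(s => s != null).ToList();
bool ignoringNulls = nonNull.Count != nonNull.Distinct().Count();

Console.WriteLine(naive);         // True
Console.WriteLine(ignoringNulls); // False
```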

Rawling · Accepted Answer

If you're worried about performance, the following code stops as soon as it finds the first duplicate item. All the other solutions so far require the whole input to be iterated at least once.

var hashset = new HashSet<string>();
if (myList.Where(s => s != null).Any(s => !hashset.Add(s)))
{
    // there are duplicates
}

hashset.Add returns false if the item already exists in the set, and Any returns true as soon as the first true value occurs, so this will only search the input as far as the first duplicate.
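The same idea can be wrapped in a reusable function. This is only a sketch: the name HasNonNullDuplicates and the sample lists are assumptions made for the example, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: true if any non-null value occurs more than once.
static bool HasNonNullDuplicates(IEnumerable<string> items)
{
    var seen = new HashSet<string>();
    // HashSet.Add returns false when the value is already in the set,
    // and Any stops enumerating at the first element for which that happens.
    return items.Where(s => s != null).Any(s => !seen.Add(s));
}

var onlyNullsRepeat = new List<string> { "a", null, "b", null };
var realDuplicate = new List<string> { "a", null, "b", "a" };

Console.WriteLine(HasNonNullDuplicates(onlyNullsRepeat)); // False
Console.WriteLine(HasNonNullDuplicates(realDuplicate));   // True
```

Creating the set inside the function keeps each call independent, so a set left over from an earlier check cannot produce a spurious hit.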

Dave Bish · Answer

I'd do this differently:

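Given that LINQ statements are evaluated lazily, the .Any will short-circuit at the first group that has more than one element, so the remaining groups never have to be counted. Note, though, that GroupBy itself has to read the whole list before it can yield its first group, so the input is still iterated once in full.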

var dupes = myList
    .Where(item => item != null)
    .GroupBy(item => item)
    .Any(g => g.Count() > 1);

if (dupes)
{
    // there are duplicates
}
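One way to see how much of the input each approach reads is to route the list through a counting iterator. The sketch below is an illustration, not a benchmark: the Counting wrapper and the sample data are assumptions made for the example, and the HashSet query is the one from the accepted answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int pulled = 0;

// Hypothetical wrapper: passes items through and counts how many were requested.
IEnumerable<string> Counting(IEnumerable<string> source)
{
    foreach (var item in source)
    {
        pulled++;
        yield return item;
    }
}

// Sample data (assumed): the duplicate appears right at the start.
var list = new List<string> { "a", "a", null, "b", "c" };

// HashSet + Any: stops at the second "a".
pulled = 0;
var seen = new HashSet<string>();
bool viaHashSet = Counting(list).Where(s => s != null).Any(s => !seen.Add(s));
int pulledByHashSet = pulled;

// GroupBy + Any: GroupBy consumes the whole source before yielding a group.
pulled = 0;
bool viaGroupBy = Counting(list)
    .Where(s => s != null)
    .GroupBy(s => s)
    .Any(g => g.Count() > 1);
int pulledByGroupBy = pulled;

Console.WriteLine($"HashSet: {viaHashSet} after {pulledByHashSet} items"); // HashSet: True after 2 items
Console.WriteLine($"GroupBy: {viaGroupBy} after {pulledByGroupBy} items"); // GroupBy: True after 5 items
```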

EDIT: Some LINQPad benchmarking (http://pastebin.com/b9reVaJu) seems to conclude that GroupBy with Count() is faster.

EDIT 2: Rawling's answer seems to be at least 5x faster than this approach!

Tags:

c#

list

linq

Cemre Mengü
