Empty HashSet - Count vs Any

I only need to know whether a HashSet hs is empty or not. I do NOT need to know exactly how many elements it contains.

So I could use this:

bool isEmpty = (hs.Count == 0);

...or this:

bool isEmpty = !hs.Any(x => true);

Which one performs better, especially when the HashSet contains a large number of elements?

Andy asked Aug 14 '13 15:08


2 Answers

On a HashSet you can use either, since HashSet<T> keeps track of its element count internally.

However, if your data is in an IEnumerable<T> or IQueryable<T> object, using result.Any() is preferable to result.Count() (both are LINQ methods).

For a sequence that does not implement ICollection<T>, LINQ's .Count() iterates through the whole enumerable, whereas .Any() only checks whether at least one element exists.
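A minimal sketch of that difference, assuming a lazily generated sequence rather than a HashSet. The Numbers iterator and the yielded counter are made up for illustration; they only make visible how far each LINQ call walks the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int yielded = 0; // how many elements the iterator has produced so far

// Hypothetical lazy sequence: it is not an ICollection<T>,
// so LINQ has no stored count it could read instead of enumerating.
IEnumerable<int> Numbers(int n)
{
    for (int i = 0; i < n; i++)
    {
        yielded++;
        yield return i;
    }
}

bool any = Numbers(1000).Any();
int yieldedByAny = yielded;   // Any() stops after the first element

yielded = 0;
int count = Numbers(1000).Count();
int yieldedByCount = yielded; // Count() has to walk all 1000 elements

Console.WriteLine($"Any: {any}, elements produced: {yieldedByAny}");       // Any: True, elements produced: 1
Console.WriteLine($"Count: {count}, elements produced: {yieldedByCount}"); // Count: 1000, elements produced: 1000
```

On a HashSet this gap does not show up, because the count is already stored.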

Update: Just a small addition: in your case, with the HashSet, .Count may be preferable, as .Any() would require an IEnumerator to be created and returned, which is a small overhead if you are not going to use the enumerator anywhere else in your code (foreach, LINQ, etc.). But I think that would be considered "micro-optimization".

Tseng answered Oct 11 '22 18:10


HashSet<T> implements ICollection<T>, which has a Count property, so a call to Count() will just call HashSet<T>.Count, which is an O(1) operation (meaning it doesn't actually have to count; it just returns the current size of the HashSet).

Any will iterate until it finds an item that matches the condition, then stop.

So in your case, it will just iterate one item, then stop, so the difference will probably be negligible.

If you had a filter that you wanted to apply (e.g. x => x.IsValid), then Any would definitely be faster, since Count(x => x.IsValid) would iterate over the entire collection, while Any would stop as soon as it finds a match.
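To make the filter case concrete, here is a rough sketch that counts how often the predicate runs. IsValid is a hypothetical stand-in for x => x.IsValid (here it just tests for even numbers), and the calls counter exists only for this demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<int> { 4, 8, 15, 16, 23, 42 };
int calls = 0; // number of times the predicate has been evaluated

// Hypothetical stand-in for x => x.IsValid that also counts its invocations.
bool IsValid(int x)
{
    calls++;
    return x % 2 == 0;
}

bool anyValid = items.Any(x => IsValid(x));
int callsForAny = calls;   // the first element already matches, so Any stops there

calls = 0;
bool anyValidViaCount = items.Count(x => IsValid(x)) > 0;
int callsForCount = calls; // Count tests every element before it can answer

Console.WriteLine($"Any: {anyValid} after {callsForAny} predicate call(s)");             // Any: True after 1 predicate call(s)
Console.WriteLine($"Count > 0: {anyValidViaCount} after {callsForCount} predicate call(s)"); // Count > 0: True after 6 predicate call(s)
```

If the first match sat at the end of the list, Any would need just as many calls as Count, so the saving depends on where the match is.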

For those reasons I generally prefer to use !Any() rather than Count() == 0, since it's more direct and avoids any potential performance problems. I would only switch to Count() == 0 if it provided a significant performance boost over !Any().

Note that Any(x=>true) is logically the same as calling Any(). That doesn't change your question, but it looks cleaner without the lambda.
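For reference, a small sketch of the emptiness checks discussed in this thread, side by side. The helper names (IsEmptyByCount and so on) are made up for illustration; note the negation, since Any returns true when the set is NOT empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var empty = new HashSet<string>();
var filled = new HashSet<string> { "a", "b" };

// Hypothetical helpers, one per variant from the question and answers.
bool IsEmptyByCount(HashSet<string> hs) => hs.Count == 0;          // reads the stored count
bool IsEmptyByAny(HashSet<string> hs) => !hs.Any();                // LINQ, no predicate
bool IsEmptyByAnyLambda(HashSet<string> hs) => !hs.Any(x => true); // same result, extra lambda

Console.WriteLine($"{IsEmptyByCount(empty)} {IsEmptyByAny(empty)} {IsEmptyByAnyLambda(empty)}");    // True True True
Console.WriteLine($"{IsEmptyByCount(filled)} {IsEmptyByAny(filled)} {IsEmptyByAnyLambda(filled)}"); // False False False
```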

D Stanley answered Oct 11 '22 17:10