Which is faster in .NET, .Contains() or .Count()?

I want to compare an array of modified records against a list of records pulled from the database, and delete those records from the database that do not exist in the incoming array. The modified array comes from a client app that maintains the database, and this code runs in a WCF service app, so if the client deletes a record from the array, that record should be deleted from the database. Here's the sample code snippet:

public void UpdateRecords(Record[] recs)
{
    // look for deleted records
    foreach (Record rec in UnitOfWork.Records.ToList())
    {
        var copy = rec;
        if (!recs.Contains(rec))                      // use this one?
        if (0 == recs.Count(p => p.Id == copy.Id))    // or this one?
        {
            // if not in the new collection, remove from database
            Record deleted = UnitOfWork.Records.Single(p => p.Id == copy.Id);
        }
    }
    // rest of method code deleted
}

My question: is there a speed advantage (or other advantage) to using the Count method over the Contains method? The Id property is guaranteed to be unique and to identify that particular record, so there is no need for a bitwise compare, which I assume Contains might do.

Anyone? Thanks, Dave

4 Answers

This would be faster:

if (!recs.Any(p => p.Id == copy.Id)) 

This has the same advantages as using Count(), but unlike Count() it stops as soon as it finds the first match.

You should not even consider Count since you are only checking for the existence of a record. You should use Any instead.

Using Count forces iteration over the entire enumerable to get the correct count, whereas Any stops enumerating as soon as it finds the first matching element.
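To make the difference concrete, here is a minimal sketch that counts how often the predicate runs under each method. The class and method names (AnyVersusCount, PredicateCalls) are made up for illustration and are not part of the question's code; plain int ids stand in for the Record objects.

```csharp
using System;
using System.Linq;

public static class AnyVersusCount
{
    // Returns how many times the predicate was invoked by Any and by Count
    // while searching `ids` for `target`.
    public static (int anyCalls, int countCalls) PredicateCalls(int[] ids, int target)
    {
        int anyCalls = 0;
        // Any stops enumerating at the first element that satisfies the predicate.
        _ = ids.Any(id => { anyCalls++; return id == target; });

        int countCalls = 0;
        // Count has to visit every element to produce the total.
        _ = ids.Count(id => { countCalls++; return id == target; });

        return (anyCalls, countCalls);
    }
}
```

Searching the array { 1, 2, 3, 4, 5 } for 2 runs the predicate twice with Any and five times with Count. When the id is not present at all, both visit every element, so the saving only applies to records that do exist.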

As for Contains, you need to consider whether equality for the type in question is equivalent to the Id comparison you are performing. By default it is not: a class that does not override Equals is compared by reference.
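A small sketch of that pitfall, using a hypothetical PlainRecord class that does not override Equals (the real Record type from the question may well behave differently). Two instances carrying the same Id are still different objects, which is presumably the situation when one copy arrives deserialized from the client and the other is loaded from the database.

```csharp
using System;
using System.Linq;

// Hypothetical record type with no Equals override,
// so the default (reference) equality applies.
public class PlainRecord
{
    public int Id { get; set; }
}

public static class ContainsDemo
{
    // Uses the default equality comparer, i.e. reference equality for PlainRecord.
    public static bool ContainsByReference(PlainRecord[] recs, PlainRecord rec)
        => recs.Contains(rec);

    // Compares the Id values explicitly, independent of object identity.
    public static bool ContainsById(PlainRecord[] recs, PlainRecord rec)
        => recs.Any(p => p.Id == rec.Id);
}
```

With an array holding one PlainRecord whose Id is 7 and a second, separately created PlainRecord with Id 7, ContainsByReference returns false while ContainsById returns true.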

Assuming Record implements both GetHashCode and Equals properly, I'd use a different approach altogether:

// I'm assuming it's appropriate to pull down all the records from the database
// to start with, as you're already doing it.
foreach (Record recordToDelete in UnitOfWork.Records.ToList().Except(recs))
{
    // remove recordToDelete from the database here
}

Basically there's no need to have an N * M lookup time - the above code will end up building a set of records from recs based on their hash code, and find non-matches rather more efficiently than the original code.

If you've actually got more to do, you could use:

HashSet<Record> recordSet = new HashSet<Record>(recs);

foreach (Record recordFromDb in UnitOfWork.Records.ToList())
{
    if (!recordSet.Contains(recordFromDb))
    {
        // Do other stuff
    }
}

(I'm not quite sure why your original code is refetching the record from the database using Single when you've already got it as rec...)
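Both snippets in this answer rely on Record implementing Equals and GetHashCode. As a rough sketch of what that might look like, assuming Id is an int and is the only thing that defines identity, the Record class below is a hypothetical stand-in for the real entity, and DeletedRecordFinder is an illustrative helper name, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Record type; equality is based on Id only.
public sealed class Record : IEquatable<Record>
{
    public int Id { get; set; }

    public bool Equals(Record other) => other != null && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as Record);

    // Must agree with Equals: records with equal Ids produce equal hash codes.
    public override int GetHashCode() => Id.GetHashCode();
}

public static class DeletedRecordFinder
{
    // Returns the database records whose Id is missing from the incoming array.
    // Except builds a hash set from `incoming`, so each lookup is roughly O(1).
    public static List<Record> FindDeleted(IEnumerable<Record> fromDb, Record[] incoming)
        => fromDb.Except(incoming).ToList();
}
```

Given database records with Ids 1, 2 and 3 and an incoming array with Ids 1 and 3, FindDeleted returns the single record with Id 2. One caveat with this sketch: because Id has a setter, changing it while the record sits in a HashSet would break lookups, so it should be treated as fixed once assigned.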

Contains() is going to use Equals() against your objects. If you have not overridden this method, it's even possible Contains() is returning incorrect results. If you have overridden it to use the object's Id to determine identity, then Count() and Contains() are doing almost exactly the same thing, except that Contains() will short-circuit as soon as it hits a match, whereas Count() will keep on counting. Any() might be a better choice than both of them.

Do you know for certain this is a bottleneck in your app? It feels like premature optimization to me. Which is the root of all evil, you know :)

