I want to compare an array of modified records against a list of records pulled from the database, and delete those records from the database that do not exist in the incoming array. The modified array comes from a client app that maintains the database, and this code runs in a WCF service app, so if the client deletes a record from the array, that record should be deleted from the database. Here's the sample code snippet:
public void UpdateRecords(Record[] recs)
{
    // look for deleted records
    foreach (Record rec in UnitOfWork.Records.ToList())
    {
        var copy = rec;
        if (!recs.Contains(rec)) // use this one?
        if (0 == recs.Count(p => p.Id == copy.Id)) // or this one?
        {
            // if not in the new collection, remove from database
            Record deleted = UnitOfWork.Records.Single(p => p.Id == copy.Id);
            UnitOfWork.Remove(deleted);
        }
    }
    // rest of method code deleted
}
My question: is there a speed advantage (or other advantage) to using the Count method over the Contains method? The Id property is guaranteed to be unique and to identify that particular record, so there is no need for a bitwise compare, which I assume Contains might do.
Anyone? Thanks, Dave
This would be faster:
if (!recs.Any(p => p.Id == copy.Id))
This has the same advantages as using Count(), but it also stops as soon as it finds the first match, unlike Count().
You should not even consider Count since you are only checking for the existence of a record. You should use Any instead.
Using Count forces iteration over the entire enumerable to get the correct count, whereas Any stops enumerating as soon as it finds the first matching element.
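To make the difference concrete, here is a minimal sketch that counts how many elements each operator actually reads from a lazily produced sequence. The class `AnyVersusCount` and the helper `CountingSource` are hypothetical names made up for this illustration; only `Any` and `Count` come from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Hypothetical helper: yields the ids 1..n and calls onRead for each element handed out.
    public static IEnumerable<int> CountingSource(int n, Action onRead)
    {
        for (int i = 1; i <= n; i++)
        {
            onRead();
            yield return i;
        }
    }

    // Number of elements read while Any() looks for target.
    public static int ReadsForAny(int n, int target)
    {
        int reads = 0;
        _ = CountingSource(n, () => reads++).Any(id => id == target);
        return reads;
    }

    // Number of elements read while Count() tallies matches for target.
    public static int ReadsForCount(int n, int target)
    {
        int reads = 0;
        _ = CountingSource(n, () => reads++).Count(id => id == target);
        return reads;
    }
}
```

With 1000 elements and a match in third position, `ReadsForAny(1000, 3)` reads 3 elements while `ReadsForCount(1000, 3)` reads all 1000.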
As for Contains, you need to consider whether, for the type in question, its equality check is equivalent to the Id comparison you are performing. By default it is not: unless the class overrides Equals, Contains falls back to reference equality.
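A small sketch of that pitfall, assuming a class that does not override Equals. `PlainRecord` is a hypothetical stand-in for the question's `Record`, and the `int` type of `Id` is also an assumption. Records arriving from a client are presumably different object instances from the ones loaded from the database, which is exactly the case modelled here.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for Record; it deliberately does NOT override Equals/GetHashCode.
public class PlainRecord
{
    public int Id { get; set; }
}

public static class ContainsDemo
{
    // Contains uses the default equality comparer, which for this class
    // means reference equality.
    public static bool ContainsByReference(PlainRecord[] recs, PlainRecord rec)
    {
        return recs.Contains(rec);
    }

    // Any with an explicit Id comparison does not depend on how Equals is defined.
    public static bool ContainsById(PlainRecord[] recs, PlainRecord rec)
    {
        return recs.Any(p => p.Id == rec.Id);
    }
}
```

Given two separate instances that both have `Id = 1`, `ContainsByReference` reports false and `ContainsById` reports true, so the `Contains` version of the original loop would delete records that are still present.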
Assuming Record implements both GetHashCode and Equals properly, I'd use a different approach altogether:
// I'm assuming it's appropriate to pull down all the records from the database
// to start with, as you're already doing it.
foreach (Record recordToDelete in UnitOfWork.Records.ToList().Except(recs))
{
    UnitOfWork.Remove(recordToDelete);
}
Basically there's no need to have an N * M lookup time: the above code will end up building a set of records from recs based on their hash code, and find non-matches rather more efficiently than the original code.
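For reference, this is roughly what "implements both GetHashCode and Equals properly" could look like when identity is defined by Id alone. It is only a sketch: `KeyedRecord` and `ExceptDemo` are hypothetical names, and an `int` Id is an assumption about the real Record type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Record whose equality is defined purely by Id.
public sealed class KeyedRecord : IEquatable<KeyedRecord>
{
    public int Id { get; set; }

    public bool Equals(KeyedRecord other)
    {
        return other != null && other.Id == Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as KeyedRecord);
    }

    // Must agree with Equals: equal records produce equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class ExceptDemo
{
    // Ids of records present in the database list but missing from the incoming list.
    public static List<int> IdsToDelete(IEnumerable<KeyedRecord> fromDb,
                                        IEnumerable<KeyedRecord> incoming)
    {
        return fromDb.Except(incoming).Select(r => r.Id).ToList();
    }
}
```

With database ids 1, 2, 3 and incoming ids 1, 3, `IdsToDelete` returns a list containing only 2. Because the hash code depends on Id, the Id should not change while a record sits in a hash-based collection.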
If you've actually got more to do, you could use:
HashSet<Record> recordSet = new HashSet<Record>(recs);
foreach (Record recordFromDb in UnitOfWork.Records.ToList())
{
    if (!recordSet.Contains(recordFromDb))
    {
        UnitOfWork.Remove(recordFromDb);
    }
    else
    {
        // Do other stuff
    }
}
(I'm not quite sure why your original code is refetching the record from the database using Single when you've already got it as rec...)
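If Record does not override Equals and GetHashCode, the same set-based idea still works by hashing the ids rather than the records. This is a variation on the approach above, not something taken from the question; `Rec` and `IdSetDemo` are hypothetical names and the `int` Id is an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Record, with no custom equality.
public class Rec
{
    public int Id { get; set; }
}

public static class IdSetDemo
{
    // Returns the database records whose Id does not appear in the incoming list.
    public static List<Rec> FindDeleted(IEnumerable<Rec> fromDb, IEnumerable<Rec> incoming)
    {
        // Build the id set once, so each later lookup is a hash probe
        // instead of a scan over the incoming list.
        var incomingIds = new HashSet<int>(incoming.Select(r => r.Id));
        return fromDb.Where(r => !incomingIds.Contains(r.Id)).ToList();
    }
}
```

Each record returned by `FindDeleted` could then be passed straight to `UnitOfWork.Remove`, without refetching it.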
Contains() is going to use Equals() against your objects. If you have not overridden this method, it's even possible Contains() is returning incorrect results. If you have overridden it to use the object's Id to determine identity, then Count() and Contains() are doing almost exactly the same thing, except that Contains() will short-circuit as soon as it hits a match, whereas Count() will keep on counting. Any() might be a better choice than both of them.
Do you know for certain this is a bottleneck in your app? It feels like premature optimization to me. Which is the root of all evil, you know :)