I was playing around with Entity Framework 6 on my home computer and decided to try inserting a fairly large number of rows, around 430k.
My first try looked like this (yes, I know it can be better, but it was only for research):
    var watch = System.Diagnostics.Stopwatch.StartNew();
    // "event" is a reserved word in C#; the @ prefix makes it a valid identifier.
    foreach (var @event in group)
    {
        db.Events.Add(@event);
        db.SaveChanges(); // one save (and one database round trip) per entity
    }
    // Sanity check: the row count for this file must match what was just added.
    var dbCount = db.Events.Count(x => x.ImportInformation.FileName == group.Key);
    if (dbCount != group.Count())
    {
        throw new Exception("Mismatch between rows added for file and current number of rows!");
    }
    watch.Stop();
    Console.WriteLine($"Added {dbCount} events to database in {watch.Elapsed}");
I started it in the evening and checked back when I got home from work. This was the result:
As you can see, 64523 events were added in the first 4 hours and 41 minutes, but then it got a lot slower: the next 66985 events took 14 hours and 51 minutes. I checked the database, and the program was still inserting events, but at an extremely low speed. I then decided to try the "new" AddRange method for DbSet.
I switched my models from IDbSet to DbSet and replaced the foreach loop with this:
    db.Events.AddRange(group);
    db.SaveChanges();
I could now add 60k+ events in around 30 seconds. It is perhaps not SqlBulkCopy fast, but it is still a huge improvement. What is happening under the hood to achieve this? I was going to check SQL Server Profiler tomorrow for the queries, but it would be nice to have an explanation of what happens in code as well.
Intuitively, a DbContext corresponds to your database (or a collection of tables and views in your database) whereas a DbSet corresponds to a table or view in your database.
The DbSet class represents an entity set that can be used for create, read, update, and delete operations. The context class (derived from DbContext) must include DbSet properties for the entities that map to database tables and views.
A DbSet represents the collection of all entities of a given type in the context, or that can be queried from the database. DbSet objects are created from a DbContext.
The AddRange() method attaches a collection of entities to the context in the Added state, so that SaveChanges() executes an INSERT command in the database for each of them.
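To make the DbContext / DbSet relationship described above concrete, here is a minimal sketch for EF6 (NuGet package EntityFramework). The entity shapes, the context name EventContext, the Imports set, and the Importer helper are assumptions made for illustration; only the names Events, ImportInformation and FileName appear in the question.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EF6, from the EntityFramework NuGet package

// Hypothetical entity shapes, guessed from the property names in the question.
public class ImportInformation
{
    public int Id { get; set; }
    public string FileName { get; set; }
}

public class Event
{
    public int Id { get; set; }
    public virtual ImportInformation ImportInformation { get; set; }
}

// Hypothetical context: each DbSet<T> property corresponds to a table or view.
public class EventContext : DbContext
{
    public DbSet<Event> Events { get; set; }
    public DbSet<ImportInformation> Imports { get; set; }
}

public static class Importer
{
    // Marks every entity as Added, then issues the INSERTs in one SaveChanges call.
    // Returns the number of state entries written, as reported by SaveChanges.
    public static int Import(EventContext db, IEnumerable<Event> events)
    {
        db.Events.AddRange(events);
        return db.SaveChanges();
    }
}
```

Declaring the properties as DbSet rather than IDbSet matters here, because AddRange is defined on DbSet and not on the IDbSet interface.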
As Jakub answered, calling SaveChanges after every added entity does not help. But moving it out of the loop will not fix the performance issue caused by the Add method itself.
Using the Add method to add multiple entities is a very common mistake. In fact, it's the DetectChanges method that's INSANELY slow: with automatic change detection enabled (the default), Add calls DetectChanges for every entity, whereas AddRange calls it once for the whole collection.
See: Entity Framework - Performance Add
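As a rough illustration of why this matters, here is a toy model. It is not EF6's implementation. ToyTracker and its members are hypothetical, and the model assumes only that each Add triggers one DetectChanges pass over everything tracked so far, while AddRange triggers a single pass.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: ToyTracker is hypothetical and is not EF6 code.
// It just counts how many tracked entries each approach inspects.
public class ToyTracker
{
    private readonly List<object> _tracked = new List<object>();

    // Total number of entries inspected by all DetectChanges passes so far.
    public long Inspections { get; private set; }

    // Stand-in for DetectChanges: one pass over every tracked entry.
    private void DetectChanges()
    {
        Inspections += _tracked.Count;
    }

    // Models Add with automatic change detection on: one pass per entity.
    public void Add(object entity)
    {
        _tracked.Add(entity);
        DetectChanges();
    }

    // Models AddRange: one pass after the whole collection is tracked.
    public void AddRange(IEnumerable<object> entities)
    {
        _tracked.AddRange(entities);
        DetectChanges();
    }
}

public static class DetectChangesDemo
{
    public static long CountWithAdd(int n)
    {
        var tracker = new ToyTracker();
        for (var i = 0; i < n; i++)
        {
            tracker.Add(new object());
        }
        return tracker.Inspections;
    }

    public static long CountWithAddRange(int n)
    {
        var items = new List<object>();
        for (var i = 0; i < n; i++)
        {
            items.Add(new object());
        }
        var tracker = new ToyTracker();
        tracker.AddRange(items);
        return tracker.Inspections;
    }

    public static void Main()
    {
        Console.WriteLine($"Add:      {CountWithAdd(1000)} inspections");
        Console.WriteLine($"AddRange: {CountWithAddRange(1000)} inspections");
    }
}
```

In this model, adding n entities one at a time costs n(n+1)/2 inspections (500500 for n = 1000), against n for AddRange (1000). That quadratic growth is consistent with the import in the question getting slower the longer it ran.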
"It is perhaps not SqlBulkCopy fast, but it is still a huge improvement"
It's possible to get performance VERY close to SqlBulkCopy.
Disclaimer: I'm the owner of the project Entity Framework Extensions
(This library is NOT free)
This library can make your code more efficient by allowing you to save multiple entities at once. All bulk operations are supported.
Example:
    // Easy to use
    context.BulkSaveChanges();
    // Easy to customize
    context.BulkSaveChanges(bulk => bulk.BatchSize = 100);
    // Perform Bulk Operations
    context.BulkDelete(customers);
    context.BulkInsert(customers);
    context.BulkUpdate(customers);
    // Customize Primary Key
    context.BulkMerge(customers, operation => {
        operation.ColumnPrimaryKeyExpression =
            customer => customer.Code;
    });