I need to insert around 2500 rows using EF Code First.
My original code looked something like this:
foreach (var item in listOfItemsToBeAdded)
{
    // biz logic
    context.MyStuff.Add(item);
}
This took a very long time: around 2.2 seconds for each DbSet.Add() call, which for 2500 rows equates to around 90 minutes.
I refactored the code to this:
var tempItemList = new List<MyStuff>();
foreach (var item in listOfItemsToBeAdded)
{
    // biz logic
    tempItemList.Add(item);
}
context.MyStuff.ToList().AddRange(tempItemList);
This only takes around 4 seconds to run. However, the .ToList() queries all the items currently in the table, which is extremely unnecessary and could be dangerous or even more time-consuming in the long run. One workaround would be to do something like context.MyStuff.Where(x => x.ID == *empty guid*).AddRange(tempItemList), because then I know nothing will ever be returned.
But I'm curious if anyone else knows of an efficient way to do a bulk insert using EF Code First?
Change detection and validation are normally very expensive parts of EF. I had great performance improvements by disabling both with:
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;
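As a rough sketch of how those two settings might be applied around an insert, assuming EF 6 (System.Data.Entity): MyContext and the shape of the MyStuff entity are hypothetical stand-ins for the types in the question, and InsertAll is an illustrative helper name. It uses DbSet<T>.AddRange, available from EF 6, which adds the entities to the set without querying the table first.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // EF 6 (EntityFramework NuGet package)

// Hypothetical entity and context, standing in for the ones in the question.
public class MyStuff
{
    public Guid ID { get; set; }
}

public class MyContext : DbContext
{
    public DbSet<MyStuff> MyStuff { get; set; }
}

public static class BulkInsertSketch
{
    // Illustrative helper: adds all items with change detection and
    // validation switched off, then restores the previous settings.
    public static void InsertAll(MyContext context, IEnumerable<MyStuff> items)
    {
        bool detectChanges = context.Configuration.AutoDetectChangesEnabled;
        bool validateOnSave = context.Configuration.ValidateOnSaveEnabled;
        try
        {
            context.Configuration.AutoDetectChangesEnabled = false;
            context.Configuration.ValidateOnSaveEnabled = false;

            // DbSet<T>.AddRange (EF 6+) marks every item as Added
            // without loading existing rows from the table.
            context.MyStuff.AddRange(items);
            context.SaveChanges();
        }
        finally
        {
            // Put the settings back so later code using this context
            // behaves as it did before.
            context.Configuration.AutoDetectChangesEnabled = detectChanges;
            context.Configuration.ValidateOnSaveEnabled = validateOnSave;
        }
    }
}
```

Restoring the flags in a finally block is a design choice here, not something the answer prescribes; it only matters if the same context is reused afterwards.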
I believe I found that in a similar SO question--perhaps it was this answer
Another answer on that question rightly points out that if you really need bulk insert performance you should look at using System.Data.SqlClient.SqlBulkCopy. The choice between EF and ADO.NET for this issue really revolves around your priorities.
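For comparison, a minimal SqlBulkCopy sketch might look like the following. The destination table name dbo.MyStuff, the single ID column, and the BulkInsert helper are assumptions made for illustration; a real table would need one DataTable column and one mapping per database column.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class SqlBulkCopySketch
{
    // Illustrative helper: table and column names are assumptions.
    public static void BulkInsert(string connectionString, IEnumerable<Guid> ids)
    {
        // Stage the rows in an in-memory DataTable whose columns
        // mirror the destination table.
        var table = new DataTable();
        table.Columns.Add("ID", typeof(Guid));
        foreach (Guid id in ids)
        {
            table.Rows.Add(id);
        }

        using (var connection = new SqlConnection(connectionString))
        using (var bulkCopy = new SqlBulkCopy(connection))
        {
            connection.Open();
            bulkCopy.DestinationTableName = "dbo.MyStuff"; // assumed name
            bulkCopy.ColumnMappings.Add("ID", "ID");       // source -> destination
            bulkCopy.WriteToServer(table);
        }
    }
}
```

This bypasses EF entirely, so none of the context's validation or change tracking applies to the inserted rows.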
I have a crazy idea, but I think it will help you.
Call SaveChanges after every 100 items added. I have a feeling that change tracking in EF performs very badly with large amounts of data.
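The batching idea could be sketched roughly as below, assuming EF 6. InsertInBatches is a hypothetical helper name, and the default batch size of 100 simply mirrors the number suggested above; whether it actually helps would need to be measured.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EF 6 (EntityFramework NuGet package)

public static class BatchedInsertSketch
{
    // Illustrative helper: adds items one by one and flushes to the
    // database every batchSize items.
    public static void InsertInBatches<TEntity>(
        DbContext context,
        IEnumerable<TEntity> items,
        int batchSize = 100) where TEntity : class
    {
        DbSet<TEntity> set = context.Set<TEntity>();
        int pending = 0;

        foreach (TEntity item in items)
        {
            set.Add(item);
            pending++;

            if (pending == batchSize)
            {
                context.SaveChanges();
                pending = 0;
            }
        }

        // Save whatever is left over from the last partial batch.
        if (pending > 0)
        {
            context.SaveChanges();
        }
    }
}
```

Note that entities stay tracked by the context after SaveChanges, so a common variation is to dispose the context and create a fresh one after each batch.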