Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Entity Framework update/insert multiple entities

Just a bit of an outline of what i am trying to accomplish. We keep a local copy of a remote database (3rd party) within our application. To download the information we use an api. We currently download the information on a schedule which then either inserts new records into the local database or updates the existing records. here is how it currently works

public void ProcessApiData(List<Account> apiData)
{
     // get the existing accounts from the local database
     List<Account> existingAccounts = _accountRepository.GetAllList();

     foreach(account in apiData)
     {
         // check if it already exists in the local database
         var existingAccount = existingAccounts.SingleOrDefault(a => a.AccountId == account.AccountId);

         // if its null then its a new record
         if(existingAccount == null)
         {
             _accountRepository.Insert(account);
             continue;
         }

         // else its a new record so it needs updating
         existingAccount.AccountName = account.AccountName;

         // ... continue updating the rest of the properties
     }

     CurrentUnitOfWork.SaveChanges();
}

This works fine, however it just feels like this could be improved.

  1. There is one of these methods per Entity, and they all do the same thing (just updating different properties) or inserting a different Entity. Would there be anyway to make this more generic?
  2. It just seems like a lot of database calls, would there be anyway to "Bulk" do this. I've had a look at this package which i have seen mentioned on a few other posts https://github.com/loresoft/EntityFramework.Extended But it seems to focus on bulk updating a single property with the same value, or so i can tell.

Any suggestions on how i can improve this would be brilliant. I'm still fairly new to c# so i'm still searching for the best way to do things.

I'm using .net 4.5.2 and Entity Framework 6.1.3 with MSSQL 2014 as the backend database

like image 554
tjackadams Avatar asked Sep 23 '16 09:09

tjackadams


3 Answers

For EFCore you can use this library:
https://github.com/borisdj/EFCore.BulkExtensions
Note: I'm the author of this one.

And for EF 6 this one:
https://github.com/TomaszMierzejowski/EntityFramework.BulkExtensions

Both are extending DbContext with Bulk operations and have the same syntax call:

context.BulkInsert(entitiesList);
context.BulkUpdate(entitiesList);
context.BulkDelete(entitiesList);

EFCore version have additionally BulkInsertOrUpdate method.

like image 134
borisdj Avatar answered Oct 13 '22 17:10

borisdj


  1. Assuming that the classes in apiData are the same as your entities, you should be able to use Attach(newAccount, originalAccount) to update an existing entity.
  2. For bulk inserts I use AddRange(listOfNewEntitities). If you have a lot of entities to insert it is advisable to batch them. Also you may want to dispose and recreate the DbContext on each batch so that it's not using too much memory.

    var accounts = new List<Account>();
    var context = new YourDbContext();
    context.Configuration.AutoDetectChangesEnabled = false;
    
    foreach (var account in apiData)
    {
        accounts.Add(account);
        if (accounts.Count % 1000 == 0) 
        // Play with this number to see what works best
        {
            context.Set<Account>().AddRange(accounts);
            accounts = new List<Account>();
            context.ChangeTracker.DetectChanges();
            context.SaveChanges();
            context?.Dispose();
            context = new YourDbContext();
        }
    }
    
    context.Set<Account>().AddRange(accounts);
    context.ChangeTracker.DetectChanges();
    context.SaveChanges();
    context?.Dispose();
    

For bulk updates, there's not anything built in in LINQ to SQL. There are however libraries and solutions to address this. See e.g. Here for a solution using expression trees.

like image 41
Hintham Avatar answered Oct 13 '22 17:10

Hintham


List vs. Dictionary

You check in a list every time if the entity exists which is bad. You should create a dictionary instead to improve performance.

var existingAccounts = _accountRepository.GetAllList().ToDictionary(x => x.AccountID);

Account existingAccount;

if(existingAccounts.TryGetValue(account.AccountId, out existingAccount))
{
    // ...code....
}

Add vs. AddRange

You should be aware of Add vs. AddRange performance when you add multiple records.

  • Add: Call DetectChanges after every record is added
  • AddRange: Call DetectChanges after all records is added

enter image description here

So at 10,000 entities, Add method have taken 875x more time to add entities in the context simply.

To fix it:

  1. CREATE a list
  2. ADD entity to the list
  3. USE AddRange with the list
  4. SaveChanges
  5. Done!

In your case, you will need to create an InsertRange method to your repository.

EF Extended

You are right. This library updates all data with the same value. That is not what you are looking for.

Disclaimer: I'm the owner of the project Entity Framework Extensions

This library may perfectly fit for your enterprise if you want to improve your performance dramatically.

You can easily perform:

  • BulkSaveChanges
  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge

Example:

public void ProcessApiData(List<Account> apiData)
{
    // Insert or Update using the primary key (AccountID)
    CurrentUnitOfWork.BulkMerge(apiData);
}
like image 4
Jonathan Magnan Avatar answered Oct 13 '22 17:10

Jonathan Magnan