 

EntityFramework insert speed is very slow with large quantity of data

I am trying to insert about 50,000 rows into a MS SQL Server database via Entity Framework 6.1.3, but it takes too long. I followed this answer: disabled AutoDetectChangesEnabled and called SaveChanges after adding every 1000 entities. It still takes about 7-8 minutes. I tried this with both a remote server and a local server; there is not much difference. I don't think this is normal. Did I forget something?

Here is my code:

    static void Main(string[] args)
    {

        var personCount = 50000;
        var personList = new List<Person>();
        var random = new Random();

        for (int i = 0; i < personCount; i++)
        {
            personList.Add(new Person
            {
                CreateDate = DateTime.Now,
                DateOfBirth = DateTime.Now,
                FirstName = "Name",
                IsActive = true,
                IsDeleted = false,
                LastName = "Surname",
                PhoneNumber = "01234567890",
                PlaceOfBirth = "Trabzon",
                Value0 = random.NextDouble(),
                Value1 = random.Next(),
                Value10 = random.NextDouble(),
                Value2 = random.Next(),
                Value3 = random.Next(),
                Value4 = random.Next(),
                Value5 = random.Next(),
                Value6 = "Value6",
                Value7 = "Value7",
                Value8 = "Value8",
                Value9 = random.NextDouble()
            });
        }

        MyDbContext context = null;

        try
        {
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;

            int count = 0;
            foreach (var entityToInsert in personList)
            {
                ++count;
                context = AddToContext(context, entityToInsert, count, 1000, true);
            }

            context.SaveChanges();
        }
        finally
        {
            if (context != null)
                context.Dispose();
        }

    }

    private static MyDbContext AddToContext(MyDbContext context, Person entity, int count, int commitCount, bool recreateContext)
    {
        context.Set<Person>().Add(entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                context.Dispose();
                context = new MyDbContext();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }

        return context;
    } 

Person class:

public class Person
{
    public int Id { get; set; }

    [MaxLength(50)]
    public string FirstName { get; set; }

    [MaxLength(50)]
    public string LastName { get; set; }

    public DateTime DateOfBirth { get; set; }

    [MaxLength(50)]
    public string PlaceOfBirth { get; set; }

    [MaxLength(15)]
    public string PhoneNumber { get; set; }

    public bool IsActive { get; set; }

    public DateTime CreateDate { get; set; }

    public int Value1 { get; set; }

    public int Value2 { get; set; }

    public int Value3 { get; set; }

    public int Value4 { get; set; }

    public int Value5 { get; set; }

    [MaxLength(50)]
    public string Value6 { get; set; }

    [MaxLength(50)]
    public string Value7 { get; set; }

    [MaxLength(50)]
    public string Value8 { get; set; }

    public double Value9 { get; set; }

    public double Value10 { get; set; }

    public double Value0 { get; set; }

    public bool IsDeleted { get; set; }
}

Query tracked from profiler:

exec sp_executesql N'INSERT [dbo].[Person]([FirstName], [LastName], [DateOfBirth], [PlaceOfBirth], [PhoneNumber], [IsActive], [CreateDate], [Value1], [Value2], [Value3], [Value4], [Value5], [Value6], [Value7], [Value8], [Value9], [Value10], [Value0], [IsDeleted])
VALUES (@0, @1, @2, @3, @4, @5, @6, @7, @8, @9, @10, @11, @12, @13, @14, @15, @16, @17, @18)
SELECT [Id]
FROM [dbo].[Person]
WHERE @@ROWCOUNT > 0 AND [Id] = scope_identity()',N'@0 nvarchar(50),@1 nvarchar(50),@2 datetime2(7),@3 nvarchar(50),@4 nvarchar(15),@5 bit,@6 datetime2(7),@7 int,@8 int,@9 int,@10 int,@11 int,@12 nvarchar(50),@13 nvarchar(50),@14 nvarchar(50),@15 float,@16 float,@17 float,@18 bit',@0=N'Name',@1=N'Surname',@2='2017-01-19 10:59:09.9882591',@3=N'Trabzon',@4=N'01234567890',@5=1,@6='2017-01-19 10:59:09.9882591',@7=731825903,@8=1869842619,@9=1701414555,@10=1468342767,@11=1962019787,@12=N'Value6',@13=N'Value7',@14=N'Value8',@15=0,65330243467041405,@16=0,85324223938083377,@17=0,7146566792925152,@18=0

I want to solve this with EF only. I know there are plenty of alternatives, but let's assume there is no other choice.

The main problem is that I used the same approach as the answer I referenced. That answer inserts 560,000 entities in 191 seconds, but I can only insert 50,000 in 7 minutes.

asked Jan 19 '17 by Gökberk Konuralp

People also ask

Why is Entity Framework slow on first load?

Entity Framework is slow the first time because the first query triggers compilation of the model. If you are using EF 6.2, you can use a model cache, which loads a prebuilt edmx for code-first models; otherwise EF generates the model on startup.
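As a sketch of what enabling that cache looks like (assuming EF 6.2+; the cache directory chosen here is an assumption, any writable path works):

```csharp
using System;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

// Sketch: cache the compiled code-first model as an .edmx file on disk,
// so later startups skip model compilation entirely.
public class CachedModelConfiguration : DbConfiguration
{
    public CachedModelConfiguration()
    {
        // DefaultDbModelStore writes/reads the cached .edmx in this directory.
        SetModelStore(new DefaultDbModelStore(AppDomain.CurrentDomain.BaseDirectory));
    }
}
```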

Is EF Core faster than EF6?

EF Core 6.0 is 31% faster at executing queries, and heap allocations have been reduced by 43%.

When should you not use EF Core?

One of the biggest reasons not to use Entity Framework Core is that your application needs the fastest possible data access. Some applications do a lot of heavy data operations with very high performance demands, but most business applications don't have performance demands that high.


1 Answer

You already got rid of the ChangeTracker problem by disabling AutoDetectChanges.

I normally recommend one of these solutions:

  1. Use AddRange instead of Add (recommended)
  2. Set AutoDetectChanges to false
  3. Split SaveChanges into multiple batches

See: http://entityframework.net/improve-ef-add-performance
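For the first option, a minimal sketch of what AddRange looks like against the code in the question (assuming `MyDbContext` exposes the set via `context.Set<Person>()` as shown there):

```csharp
using (var context = new MyDbContext())
{
    // AddRange calls DetectChanges once for the whole collection,
    // instead of once per entity as Add does, so the per-entity
    // change-tracking overhead disappears even without disabling
    // AutoDetectChangesEnabled.
    context.Set<Person>().AddRange(personList);
    context.SaveChanges();
}
```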

Splitting SaveChanges into multiple batches won't really change performance here, since you already set AutoDetectChanges to false.

The major problem is that Entity Framework makes a database round-trip for every entity you need to insert. So, if you INSERT 50,000 entities, 50,000 database round-trips are performed which is INSANE.

What you need to do to solve your issue is reducing the number of database round-trips.

One free way to do it is using SqlBulkCopy: https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy(v=vs.110).aspx
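A sketch of that approach, assuming the column names from the profiler trace above and a `connectionString` variable for the target database (only three columns are shown; the remaining Person columns would be added the same way):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Sketch: stream all rows in one bulk operation instead of
// 50,000 individual INSERT round-trips.
var table = new DataTable();
table.Columns.Add("FirstName", typeof(string));
table.Columns.Add("LastName", typeof(string));
table.Columns.Add("DateOfBirth", typeof(DateTime));
// ... add the remaining Person columns the same way ...

foreach (var p in personList)
    table.Rows.Add(p.FirstName, p.LastName, p.DateOfBirth);

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.Person";
        bulkCopy.BatchSize = 5000; // rows sent per batch to the server
        bulkCopy.WriteToServer(table);
    }
}
```

Note that SqlBulkCopy bypasses EF entirely, so the inserted entities are not tracked and identity values are not read back into the objects.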


Disclaimer: I'm the owner of Entity Framework Extensions

This library allows you to perform all bulk operations you need for your scenarios:

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

You will be able to insert 50,000 entities in a few seconds.

Example

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
   operation.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});

Does your extension do that? Or do you know any other way of saving many rows with geometry columns efficiently?

Yes, our library supports geometry columns.

answered Sep 29 '22 by Jonathan Magnan