
C# Efficiently delete 50000 records in batches using SQLBulkCopy or equivalent library

I'm performing a bulk delete in batches, like this:

  while (castedEndedItems.Any())
  {
    var subList = castedEndedItems.Take(4000).ToList();
    DBRetry.Do(() => EFBatchOperation.For(ctx, ctx.SearchedUserItems).Where(r => subList.Any(a => a == r.ItemID)).Delete(), TimeSpan.FromSeconds(2));
    castedEndedItems.RemoveRange(0, subList.Count);
    Console.WriteLine("Completed a batch of ended items");
  }

As you can see, I take a batch of 4000 items at a time and pass them as an argument to the query.

The library I'm using for the bulk delete is EntityFramework.Utilities:

https://github.com/MikaelEliasson/EntityFramework.Utilities

However, the performance is terrible. I tested the application a couple of times, and deleting 80000 records, for example, takes literally 40 minutes.

I should note that the column I'm deleting by (ItemID) is of type varchar(400) and is indexed for performance reasons.

Is there another library I could use, or a way to tweak this query to make it faster? The current performance is unacceptable.

User987 asked Jan 28 '19 14:01



2 Answers

If you are prepared to use a stored procedure then you can do this without any external library:

  • Create the sproc using a table valued parameter @ids
  • Define a SQL type for that table valued parameter (just an id column assuming a simple PK)
  • In the sproc use

    delete from table where id in (select id from @ids);
    
  • In your application, create a DataTable and populate it to match the SQL table type

  • Pass the DataTable as a command parameter when calling the sproc.

This answer illustrates the process.

Any other option will need to do the equivalent of this – or something less efficient.
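The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested implementation: the type name dbo.ItemIdList, the procedure name dbo.DeleteSearchedUserItems, the table name dbo.SearchedUserItems and the helper class are all hypothetical, and the varchar(400) column is taken from the question. It assumes SQL Server and System.Data.SqlClient (Microsoft.Data.SqlClient exposes the same types on newer .NET).

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // assumption: Microsoft.Data.SqlClient on newer .NET

public static class TvpDelete
{
    // One-time SQL setup (all object names are hypothetical):
    //
    //   CREATE TYPE dbo.ItemIdList AS TABLE (ItemID varchar(400) NOT NULL);
    //
    //   CREATE PROCEDURE dbo.DeleteSearchedUserItems @ids dbo.ItemIdList READONLY
    //   AS
    //       DELETE FROM dbo.SearchedUserItems
    //       WHERE ItemID IN (SELECT ItemID FROM @ids);

    // Builds a single-column DataTable whose shape matches dbo.ItemIdList.
    public static DataTable BuildIdTable(IEnumerable<string> itemIds)
    {
        var table = new DataTable();
        table.Columns.Add("ItemID", typeof(string));
        foreach (var id in itemIds)
        {
            table.Rows.Add(id);
        }
        return table;
    }

    // Sends all ids to the sproc in one round trip. Returns the affected-row
    // count reported by the server (-1 if the procedure sets NOCOUNT ON).
    public static int DeleteItems(SqlConnection connection, IEnumerable<string> itemIds)
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = "dbo.DeleteSearchedUserItems";

            // A table-valued parameter is passed as SqlDbType.Structured,
            // with TypeName set to the SQL table type defined above.
            var ids = cmd.Parameters.Add("@ids", SqlDbType.Structured);
            ids.TypeName = "dbo.ItemIdList";
            ids.Value = BuildIdTable(itemIds);

            return cmd.ExecuteNonQuery();
        }
    }
}
```

With an open connection, the call would look like TvpDelete.DeleteItems(connection, castedEndedItems), where castedEndedItems is the list of ids from the question.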

Richard answered Sep 18 '22 13:09



Any EF solution here is probably going to perform lots of discrete operations. Instead, I would suggest manually building your SQL in a loop, something like:

using(var cmd = db.CreateCommand())
{
    int index = 0;
    var sql = new StringBuilder("delete from [SomeTable] where [SomeId] in (");
    foreach(var item in items)
    {
        if (index != 0) sql.Append(',');
        var name = "@id_" + index++;
        sql.Append(name);
        cmd.Parameters.AddWithValue(name, item.SomeId);            
    }
    cmd.CommandText = sql.Append(");").ToString();
    cmd.ExecuteNonQuery();
}

You may need to loop this in batches, though, as there is an upper limit on the number of parameters allowed on a command.
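One way the batching might look is sketched below, assuming SQL Server, where a single command accepts at most 2100 parameters. The class name, the batch size of 2000 and the explicit varchar(400) parameter type are assumptions for illustration; the last one is taken from the column type given in the question. AddWithValue sends a .NET string as nvarchar, which against a varchar column may force a conversion that hurts index use, so the sketch declares the type explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // assumption: Microsoft.Data.SqlClient on newer .NET
using System.Text;

public static class BatchedInDelete
{
    // SQL Server allows at most 2100 parameters per command; stay below that.
    public const int BatchSize = 2000;

    // Builds "delete ... in (@id_0,@id_1,...);" for the given number of ids.
    public static string BuildDeleteSql(int count)
    {
        var sql = new StringBuilder("delete from [SomeTable] where [SomeId] in (");
        for (int i = 0; i < count; i++)
        {
            if (i != 0) sql.Append(',');
            sql.Append("@id_").Append(i);
        }
        return sql.Append(");").ToString();
    }

    // Issues one parameterized delete per batch; returns the total rows affected.
    public static int DeleteInBatches(SqlConnection db, IReadOnlyList<string> ids)
    {
        int deleted = 0;
        for (int offset = 0; offset < ids.Count; offset += BatchSize)
        {
            int count = Math.Min(BatchSize, ids.Count - offset);
            using (var cmd = db.CreateCommand())
            {
                cmd.CommandText = BuildDeleteSql(count);
                for (int i = 0; i < count; i++)
                {
                    // Assumption: the key column is varchar(400), as in the question.
                    cmd.Parameters.Add("@id_" + i, SqlDbType.VarChar, 400).Value = ids[offset + i];
                }
                deleted += cmd.ExecuteNonQuery();
            }
        }
        return deleted;
    }
}
```

Each batch is still a separate round trip, so for 80000 ids this sketch would issue 40 commands.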

Marc Gravell answered Sep 18 '22 13:09
