Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speed up LINQ inserts

I have a CSV file and I have to insert it into a SQL Server database. Is there a way to speed up the LINQ inserts?

I've created a simple Repository method to save a record:

    public void SaveOffer(Offer offer)
    {
        Offer dbOffer = this.db.Offers.SingleOrDefault (
             o => o.offer_id == offer.offer_id);

        // add new offer
        if (dbOffer == null)
        {
            this.db.Offers.InsertOnSubmit(offer);
        }
        //update existing offer
        else
        {
            dbOffer = offer;
        }

        this.db.SubmitChanges();
    }

But using this method, the program is way much slower then inserting the data using ADO.net SQL inserts (new SqlConnection, new SqlCommand for select if exists, new SqlCommand for update/insert).

On 100k csv rows it takes about an hour vs 1 minute or so for the ADO.net way. For 2M csv rows it took ADO.net about 20 minutes. LINQ added about 30k of those 2M rows in 25 minutes. My database has 3 tables, linked in the dbml, but the other two tables are empty. The tests were made with all the tables empty.

P.S. I've tried to use SqlBulkCopy, but I need to do some transformations on Offer before inserting it into the db, and I think that defeats the purpose of SqlBulkCopy.

Updates/Edits: After 18hours, the LINQ version added just ~200K rows.

I've tested the import just with LINQ inserts too, and also is really slow compared with ADO.net. I haven't seen a big difference between just inserts/submitchanges and selects/updates/inserts/submitchanges.

I still have to try batch commit, manually connecting to the db and compiled queries.

like image 229
Marius Avatar asked Aug 25 '09 14:08

Marius


3 Answers

SubmitChanges does not batch changes, it does a single insert statement per object. If you want to do fast inserts, I think you need to stop using LINQ.

While SubmitChanges is executing, fire up SQL Profiler and watch the SQL being executed.

See question "Can LINQ to SQL perform batch updates and deletes? Or does it always do one row update at a time?" here: http://www.hookedonlinq.com/LINQToSQLFAQ.ashx

It links to this article: http://www.aneyfamily.com/terryandann/post/2008/04/Batch-Updates-and-Deletes-with-LINQ-to-SQL.aspx that uses extension methods to fix linq's inability to batch inserts and updates etc.

like image 145
Alex Black Avatar answered Sep 22 '22 23:09

Alex Black


Have you tried wrapping the inserts within a transaction and/or delaying db.SubmitChanges so that you can batch several inserts?

Transactions help throughput by reducing the needs for fsync()'s, and delaying db.SubmitChanges will reduce the number of .NET<->db roundtrips.

Edit: see http://www.sidarok.com/web/blog/content/2008/05/02/10-tips-to-improve-your-linq-to-sql-application-performance.html for some more optimization principles.

like image 43
Eamon Nerbonne Avatar answered Sep 24 '22 23:09

Eamon Nerbonne


Have a look at the following page for a simple walk-through of how to change your code to use a Bulk Insert instead of using LINQ's InsertOnSubmit() function.

You just need to add the (provided) BulkInsert class to your code, make a few subtle changes to your code, and you'll see a huge improvement in performance.

Mikes Knowledge Base - BulkInserts with LINQ

Good luck !

like image 38
Mike Gledhill Avatar answered Sep 24 '22 23:09

Mike Gledhill