Using LINQ2SQL to insert a large number of records

We have a small C# tool that we threw together to parse a data file, build up some objects, and insert them into the DB.

The logic is essentially:

string[] lines = File.ReadAllLines("C:\\Temp\\Data.dat");

foreach(string line in lines)
{
    MyDataObject obj = ParseObject(line);
    myDataContext.MyDataObjects.InsertOnSubmit(obj);
}

myDataContext.SubmitChanges();

This was fine at the outset, as the data file was only ~1,000 lines long each day, but recently the file has grown to ~30,000 lines and the process has become painfully slow.

Everything up to the SubmitChanges() call is fine, but once it starts dumping the 30,000 inserts to the DB, it just grinds to a halt. As a test, I dummied up 30,000 insert statements and ran them directly from Query Analyzer (QA). It took roughly 8 minutes.

After 8 minutes, the C# / LINQ version had only completed about 25% of the inserts.

Anyone got any suggestions on how I might go about optimizing this?

asked by Eoin Campbell


1 Answer

If you are writing a large volume of homogeneous data, SqlBulkCopy may be a more appropriate tool, for example paired with CsvReader to read the lines (SqlBulkCopy can accept an IDataReader, which means you don't have to buffer all 30,000 lines in memory).

If the data is CSV, this can be as simple as:

// assumes a CsvReader that wraps a TextReader (as the LumenWorks reader does);
// the second argument indicates whether the file has a header row
using (CsvReader reader = new CsvReader(new StreamReader(path), true))
using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
{
    bcp.DestinationTableName = "SomeTable";
    bcp.WriteToServer(reader);
}

If the data is more complex (non-CSV), then SimpleDataReader might be useful: you just subclass it and add the code to represent your data per row.
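For non-CSV data like the line-per-record file in the question, here is a minimal buffered sketch (not SimpleDataReader itself): parse the lines into a DataTable and hand that to SqlBulkCopy. The '|' delimiter and the Id/Name columns are assumptions for illustration. Unlike the IDataReader route it holds all rows in memory, but at ~30,000 lines that is unlikely to matter:

DataTable table = new DataTable();
table.Columns.Add("Id", typeof(int));      // hypothetical columns matching SomeTable
table.Columns.Add("Name", typeof(string));

foreach (string line in File.ReadAllLines("C:\\Temp\\Data.dat"))
{
    string[] parts = line.Split('|');      // assumed '|'-delimited format
    table.Rows.Add(int.Parse(parts[0]), parts[1]);
}

using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
{
    bcp.DestinationTableName = "SomeTable";
    bcp.WriteToServer(table);              // one bulk operation instead of 30,000 INSERTs
}

If the file keeps growing, moving the parsing behind a small IDataReader (which is what subclassing SimpleDataReader gives you) restores the streaming behaviour of the CSV version.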

answered by Marc Gravell