We have a small C# tool that we threw together to parse a data file, build up some objects, and insert them into the DB.
The logic is essentially:
string[] lines = File.ReadAllLines("C:\\Temp\\Data.dat");
foreach (string line in lines)
{
    MyDataObject obj = ParseObject(line);
    myDataContext.MyDataObjects.InsertOnSubmit(obj);
}
myDataContext.SubmitChanges();
This was fine at the outset, as the data file was only ~1,000 lines long each day, but recently the file has grown to ~30,000 lines and the process has become painfully slow.
Everything up to the SubmitChanges() call is fine, but once it starts dumping the 30,000 inserts to the DB, it just grinds to a halt. As a test, I dummied up 30,000 insert statements and ran them directly from QA; it took roughly 8 minutes. After 8 minutes, the C# / LINQ version had only completed about 25% of the inserts.
Anyone got any suggestions on how I might go about optimizing this?
If you are writing a large volume of homogeneous data, SqlBulkCopy may be a more appropriate tool, for example perhaps with CsvReader to read the lines (since SqlBulkCopy can accept an IDataReader, which means you don't have to buffer all 30k lines into memory).
If the data is CSV, this can be as simple as:
using (CsvReader reader = new CsvReader(path))
using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
{
    bcp.DestinationTableName = "SomeTable";
    bcp.WriteToServer(reader);
}
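If the file's column order doesn't line up with the destination table, SqlBulkCopy also lets you map columns explicitly, and you can tune how rows are batched. A small sketch (set these on bcp before the WriteToServer call; the column names here are placeholders, not anything from the original post):
// Placeholder names: map a source column to a differently named destination column.
bcp.ColumnMappings.Add("Column0", "CustomerName");
bcp.BatchSize = 5000;      // rows per batch sent to the server
bcp.BulkCopyTimeout = 0;   // no timeout on the copy operation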
If the data is more complex (not CSV), then SimpleDataReader might be useful - you just subclass it and add the code to represent your data, one row at a time.
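If subclassing a reader feels like overkill, a simpler (though more memory-hungry, since it buffers every row) variant is to load the parsed objects into a DataTable and hand that to SqlBulkCopy; at ~30,000 rows that is usually acceptable. This is only a sketch - the table name, column names and MyDataObject properties below are assumptions standing in for your real schema and whatever ParseObject actually produces:
using System.Data;
using System.Data.SqlClient;
using System.IO;

// Sketch only: "SomeTable", the column names and the MyDataObject properties
// (Id, Name, Amount) are placeholders, not taken from the original post.
DataTable table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Amount", typeof(decimal));

foreach (string line in File.ReadAllLines("C:\\Temp\\Data.dat"))
{
    MyDataObject obj = ParseObject(line);         // existing parsing logic from the question
    table.Rows.Add(obj.Id, obj.Name, obj.Amount);
}

using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
{
    bcp.DestinationTableName = "SomeTable";
    bcp.WriteToServer(table);                     // one bulk load instead of 30,000 single inserts
}
The streaming IDataReader / SimpleDataReader route avoids holding all the rows in memory, so prefer it if the file keeps growing.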