 

SqlBulkCopy and Entity Framework

My current project consists of 3 standard layers: data, business, and presentation. I would like to use data entities for all my data access needs. Part of the app's functionality is that it needs to copy all of the data in a flat file into a database. The file is not too big, so I can use SqlBulkCopy. I have found several articles on using the SqlBulkCopy class in .NET; however, they all use DataTables to move data back and forth.

Is there a way to use data entities along with SqlBulkCopy or will I have to use DataTables?

asked Mar 31 '10 by Skadoosh


2 Answers

You'll need to convert the entities to an IDataReader or DataTable.

There is a small helper class designed to assist with this: http://archive.msdn.microsoft.com/LinqEntityDataReader/Release/ProjectReleases.aspx?ReleaseId=389

EDIT: The MSDN link is broken; an alternative copy can be found here: https://github.com/matthewschrager/Repository/blob/master/Repository.EntityFramework/EntityDataReader.cs

Then you can use SqlBulkCopy like so:

// AsDataReader() is the extension method provided by the EntityDataReader helper linked above
using (var sbCopy = new SqlBulkCopy(connectionString))
{
    sbCopy.DestinationTableName = "TableName";
    sbCopy.WriteToServer(entitiesList.AsDataReader());
}
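If you'd rather take the DataTable route mentioned above instead of the IDataReader helper, a hand-rolled conversion also works. This is only a minimal sketch, assuming a hypothetical Customer entity and a matching Customers table (it needs System.Data and System.Data.SqlClient):

// Build a DataTable by hand from the entities (hypothetical Customer type)
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));

foreach (var c in customers)
{
    table.Rows.Add(c.Id, c.Name);
}

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "Customers";
    bulk.WriteToServer(table);
}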
answered Oct 12 '22 by Dave Hogan


We've tried and tested a couple of approaches to bulk inserting with EF and eventually went with table-valued parameters, which gave the best performance across a range of row sizes. I don't have the numbers to hand, but I know this question, Performance of bcp/BULK INSERT vs. Table-Valued Parameters, was a guiding factor.

We originally used SqlBulkCopy coupled with an adapter that took an IEnumerable&lt;T&gt; and created an IDataReader; it also generated the relevant metadata for SqlBulkCopy. The advantage here was that the import was a code-only affair. The code that @davehogan posted was used as a basis for this.

Table-valued parameters require a stored procedure and a table type defined in the database. If you're using code-first you can execute SQL to create these as part of your creation script. Whilst this is more work, we found that we got significantly more consistent and faster throughput of rows into the database.
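As a rough illustration only (the type, procedure, and column names below are made up, not the ones we used), the setup and the call from .NET look something like this:

// One-time setup, e.g. executed from your code-first creation script:
//   CREATE TYPE dbo.CustomerRows AS TABLE (Id INT, Name NVARCHAR(100));
//   CREATE PROCEDURE dbo.ImportCustomers @rows dbo.CustomerRows READONLY
//   AS INSERT INTO dbo.Customers (Id, Name) SELECT Id, Name FROM @rows;

var rows = new DataTable();
rows.Columns.Add("Id", typeof(int));
rows.Columns.Add("Name", typeof(string));
foreach (var c in customers)
    rows.Rows.Add(c.Id, c.Name);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.ImportCustomers", conn) { CommandType = CommandType.StoredProcedure })
{
    var p = cmd.Parameters.AddWithValue("@rows", rows);
    p.SqlDbType = SqlDbType.Structured;  // this is what makes it a table-valued parameter
    p.TypeName = "dbo.CustomerRows";
    conn.Open();
    cmd.ExecuteNonQuery();
}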

Also, it's worth considering not bulk inserting into your main table. We use a temp heap table and add a clustered index to it once the data is imported. We then perform a MERGE between the temp table and the main table. This has the benefit of not locking the main table's index while inserting and improves concurrency. We tend to get upwards of 2500 rows/sec per CPU inserted using this method.
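A very rough sketch of that pattern follows, with hypothetical table and column names and simplified SQL (a persistent staging table is used here rather than a true temp table, since SqlBulkCopy runs on its own connection):

// 1. Bulk copy into an empty staging heap table (no indexes yet)
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.Customers_Staging";
    bulk.WriteToServer(entitiesList.AsDataReader());
}

// 2. Index the staged rows, then MERGE them into the main table
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(@"
    CREATE CLUSTERED INDEX IX_Staging ON dbo.Customers_Staging (Id);

    MERGE dbo.Customers AS target
    USING dbo.Customers_Staging AS source ON target.Id = source.Id
    WHEN MATCHED THEN
        UPDATE SET target.Name = source.Name
    WHEN NOT MATCHED THEN
        INSERT (Id, Name) VALUES (source.Id, source.Name);

    TRUNCATE TABLE dbo.Customers_Staging;", conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();
}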

Let me know if you want more info.

answered Oct 12 '22 by Dean Ward