Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does SqlBulkCopy Work

I am familiar with the C# SqlBulkCopy class where you can call the 'WriteToServer' method passing through a DataTable.

My question is what underlying mechanism in SQL server is used to bulk insert that data?

The reason I ask is that the bulk insert referenced in the Bulk Insert MSDN T-SQL help file requires a data file to import. Does the SqlBulkCopy create a data file?

I would like to understand this stuff to work out whether I can use the bulk insert functionality in SQL.

If I write a SQL statement that prepares all the rows to insert into a particular table (thousands of rows) can I bulk insert them into the destination table? Something like this is how I am doing it now,

INSERT INTO sync_filters (table_name, device_id, road_id, contract_id)
    SELECT * FROM dbo.sync_contract_filters (@device_id)

And the dbo.sync_contract_filters is a function to generate all the rows to insert. Can this be bulk inserted?

like image 455
peter Avatar asked Aug 29 '12 23:08

peter


People also ask

How does SqlBulkCopy update data?

Upload the data to the temporary table, then perform the SqlBulkCopy update. Using SqlBulkCopy(), upload the datatable's data to the temporary table. Then execute a SQL command to update the main table's data from the temporary table. Finally drop the temporary table.

Does SqlBulkCopy use transaction?

By default, a bulk copy operation is its own transaction. When you want to perform a dedicated bulk copy operation, create a new instance of SqlBulkCopy with a connection string, or use an existing SqlConnection object without an active transaction.

How Bulk Copy works?

SqlBulkCopy does not create a data file. It streams the data table directly from the . Net DataTable object to the server using the available communication protocol (Named Pipes, TCP/IP, etc...) and insert the data to the destination table in bulk using the same technique used by BCP.

What is SqlBulkCopy in VB net?

SqlBulkCopy lets you bulk load a SQL Server table with data from another source. This means that it copies (in bulk) data from another source into an SQL database table.


2 Answers

SqlBulkCopy does not create a data file. It streams the data table directly from the .Net DataTable object to the server using the available communication protocol (Named Pipes, TCP/IP, etc...) and insert the data to the destination table in bulk using the same technique used by BCP.

like image 195
Sam Anwar Avatar answered Oct 06 '22 13:10

Sam Anwar


It took 7 years, but we finally have an answer...

Expounding upon Sam Anwar's answer, I can confirm it is converting the data to a raw byte stream and writing it to SQL as if it were streaming in from a file. How it tricks SQL into thinking it's reading a file is beyond me.

I wanted to do a bulk insert from inside a query, to speed up a slow clustered index insert. Upon finding your post here, somehow I became disturbingly intrigued, so I spent the past several hours studying it.

The execution path that actually writes data to the server seems to be:

Your Code:

  1. Your code calls System.Data.SqlClient.SqlBulkCopy.WriteToServer()

inside System.Data.SqlClient.SqlBulkCopy:

  1. which calls WriteRowSourceToServerAsync()
  2. which calls WriteRowSourceToServerCommon() to map the columns and WriteToServerInternalAsync() to write the data
  3. which calls WriteToServerInternalRestContinuedAsync()
  4. which calls AnalyzeTargetAndCreateUpdateBulkCommand() (This is the answer. Skip to step 14 to read about it.) and CopyBatchesAsync()
  5. which (CopyBatchesAsync) calls SubmitBulkUpdateCommand()

-- inside System.Data.SqlClient.TdsParser:

  1. which calls System.Data.SqlClient.TdsParser.TdsExecuteSQLBatch()
  2. which calls WriteString() or similar methods to convert the data into a byte array
  3. which calls WriteByteArray()
  4. which calls WritePacket()
  5. which calls WriteSni()
  6. which calls SNIWritePacket()

-- inside System.Data.SqlClient.SNINativeMethodWrapper:

  1. which calls System.Data.SqlClient.SNINativeMethodWrapper.SNIWritePacket()
  2. which extern calls SNIWriteAsyncWrapper() or SNIWriteSyncOverAsync()

Now here's where it gets tricky. I think this follows, but how I got there is a bit hacky. I opened the file properties on my copy of sni.dll, went to the details tab, and inside the Product Version property I found a reference to a "commit hash" of d0d5c7b49271cadb6d97de26d8e623e98abdc8db.

So I googled that hash, and via this Nuget search I found this Nuget package, whose title includes "System.Data.SqlClient.sni", which implies the namespace System.Data.SqlClient.SNI, which I found here, but this doesn't have the right methods and doesn't actually seem to communicate with a server.

So this is where I ran out of know-how; this is as deep as I could get before it goes into native code I can't find anywhere. And although I'm not sure what all that other noise up above was...

  1. Remember Step 4 (WriteToServerInternalRestContinuedAsync()) also calls AnalyzeTargetAndCreateUpdateBulkCommand()
  2. which concatenates a SQL query inside a StringBuilder named updateBulkCommandText. Line 544 in that last link.

TLDR: Ultimately it appears it just executes an INSERT BULK query (which does not require a file), and does not actually use BULK INSERT (which does). Note these two commands look very similar.

An important note in the Microsoft docs:

Used by external tools to upload a binary data stream. This option is not intended for use with tools such as SQL Server Management Studio, SQLCMD, OSQL, or data access application programming interfaces such as SQL Server Native Client.

Which I interpret as "use at your own risk and don't expect help". Which is almost as good as a green light, in all fairness.

like image 20
tsilb Avatar answered Oct 06 '22 13:10

tsilb