
What is the recommended batch size for SqlBulkCopy?

What is the recommended batch size for SqlBulkCopy? I'm looking for a general formula I can use as a starting point for performance tuning.

asked Apr 22 '09 by Jonathan Allen

People also ask

What is BCP batch size?

The bcp utility's -b batch_size option specifies the number of rows per batch of imported data. Each batch is imported and logged as a separate transaction that imports the whole batch before being committed.

What is SQL batch size?

For ongoing loads, you can use 20,000 rows for a large batch. You should not exceed 100,000 rows in a large batch. Furthermore, for Microsoft SQL Server and Oracle environments, you should limit the number of records in the EIM tables to those that are being processed.

What is batch size in bulk insert?

Batch size is the number of records in each batch; the rows in a batch are sent to the server at the end of each batch. The BatchSize property gets or sets the number of records to use in a batch. The following example saves bulk data in batches of 1,000 rows: context.BulkSaveChanges(options => options.BatchSize = 1000);

What is BatchSize in SqlBulkCopy?

By default, SqlBulkCopy processes the operation in a single batch: if you have 100,000 rows to copy, all 100,000 rows are copied at once. Not specifying a BatchSize can impact your application, for example by decreasing SqlBulkCopy performance, so set it explicitly (e.g. BatchSize = 4000;).
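
For illustration, here is a minimal sketch of setting that property; the connection string and the dbo.TargetTable destination are placeholders, not taken from this page:

    using System.Data;
    using System.Data.SqlClient;

    // Minimal sketch: copy a DataTable with an explicit BatchSize.
    // The connection string and destination table are placeholders.
    static void CopyRows(DataTable rows)
    {
        using (var connection = new SqlConnection("Server=.;Database=Demo;Integrated Security=true"))
        {
            connection.Open();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.TargetTable";
                bulkCopy.BatchSize = 4000;     // rows per batch; the default of 0 means everything in one batch
                bulkCopy.BulkCopyTimeout = 0;  // disable the 30-second default timeout for long loads

                bulkCopy.WriteToServer(rows);
            }
        }
    }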


2 Answers

I have an import utility sitting on the same physical server as my SQL Server instance. Using a custom IDataReader, it parses flat files and inserts them into a database using SqlBulkCopy. A typical file has about 6 million qualified rows, averaging 5 columns of decimal and short text, about 30 bytes per row.

Given this scenario, I found a batch size of 5,000 to be the best compromise of speed and memory consumption. I started with 500 and experimented with larger values. I found 5,000 to be 2.5x faster, on average, than 500. Inserting the 6 million rows takes about 30 seconds with a batch size of 5,000 and about 80 seconds with a batch size of 500.

10,000 was not measurably faster. Moving up to 50,000 improved the speed by a few percentage points, but it's not worth the increased load on the server. Going above 50,000 showed no improvement in speed.

This isn't a formula, but it's another data point for you to use.
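
In case it helps, here is a rough sketch of that loading pattern, assuming you supply your own IDataReader over the parsed flat file; the destination table name and timeout setting are placeholders of mine:

    using System.Data;
    using System.Data.SqlClient;

    // Sketch of the setup described above: stream rows from a custom
    // IDataReader into SqlBulkCopy with a batch size of 5,000.
    // The caller supplies the reader; dbo.ImportTarget is a placeholder.
    static void BulkLoad(IDataReader reader, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.ImportTarget";
                bulkCopy.BatchSize = 5000;      // the sweet spot found above
                bulkCopy.BulkCopyTimeout = 0;   // 6 million rows can outlive the 30-second default

                // Rows are streamed from the reader rather than buffered in memory.
                bulkCopy.WriteToServer(reader);
            }
        }
    }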

answered Oct 12 '22 by Alric


This is an issue I have also spent some time looking into. I am looking to optimize importing large CSV files (16+ GB, 65+ million records, and growing) into a SQL Server 2005 database using a C# console application (.NET 2.0). As Jeremy has already pointed out, you will need to do some fine-tuning for your particular circumstances, but I would recommend an initial batch size of 500, then test values both above and below it.

I got the recommendation to test values between 100 and 1000 for batch size from this MSDN forum post, and was skeptical. But when I tested for batch sizes between 100 and 10,000, I found that 500 was the optimal value for my application. The 500 value for SqlBulkCopy.BatchSize is also recommended here.

To further optimize your SqlBulkCopy operation, check out this MSDN advice; I find that using SqlBulkCopyOptions.TableLock helps to reduce loading time.
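
As a sketch of that combination (the table name is a placeholder, the reader is whatever you use to walk the CSV, and 500 is simply the value that tested fastest for my data):

    using System.Data;
    using System.Data.SqlClient;

    // Sketch of the TableLock + small-batch setup mentioned above.
    // dbo.CsvTarget is a placeholder; the caller supplies a reader over the CSV.
    static void LoadCsv(IDataReader csvReader, string connectionString)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
        {
            bulkCopy.DestinationTableName = "dbo.CsvTarget";
            bulkCopy.BatchSize = 500;       // the value that tested fastest here
            bulkCopy.BulkCopyTimeout = 0;

            bulkCopy.WriteToServer(csvReader);
        }
    }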

answered Oct 12 '22 by Tangiest