What is the recommended batch size for SqlBulkCopy? I'm looking for a general formula I can use as a starting point for performance tuning.
For the bcp utility, the -b batch_size option specifies the number of rows per batch of imported data. Each batch is imported and logged as a separate transaction that imports the whole batch before being committed. The performance statistics generated by the bcp utility also show the packet size used.
For ongoing loads, you can use 20,000 rows for a large batch. You should not exceed 100,000 rows in a large batch. Furthermore, for Microsoft SQL Server and Oracle environments, you should limit the number of records in the EIM tables to those that are being processed.
Batch size is the number of records in each batch. The rows in a batch are sent to the server at the end of each batch. The BatchSize property gets or sets the number of records to use in a batch. The following example saves bulk data in batches of 1,000 rows: context.BulkSaveChanges(options => options.BatchSize = 1000);
By default, SqlBulkCopy processes the operation in a single batch: if you have 100,000 rows to copy, all 100,000 rows are copied at once. Not specifying a BatchSize can therefore decrease SqlBulkCopy performance. An explicit setting looks like this: BatchSize = 4000;
I have an import utility sitting on the same physical server as my SQL Server instance. Using a custom IDataReader, it parses flat files and inserts them into a database using SqlBulkCopy. A typical file has about 6M qualified rows, averaging 5 columns of decimal and short text, about 30 bytes per row.
Given this scenario, I found a batch size of 5,000 to be the best compromise of speed and memory consumption. I started with 500 and experimented with larger. I found 5000 to be 2.5x faster, on average, than 500. Inserting the 6 million rows takes about 30 seconds with a batch size of 5,000 and about 80 seconds with batch size of 500.
A batch size of 10,000 was not measurably faster than 5,000. Moving up to 50,000 improved the speed by a few percentage points, but it's not worth the increased load on the server. Batch sizes above 50,000 showed no further improvement in speed.
This isn't a formula, but it's another data point for you to use.
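The experiment described above (same data, different batch sizes, compare elapsed time) could be sketched roughly as follows. This is only a minimal illustration, not the code actually used for those measurements: the class name BatchSizeBenchmark, the method Run, and the parameters connectionString, tableName and openSource are all hypothetical placeholders, and it assumes the target table can safely be truncated between runs.

```csharp
using System;
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient exposes the same types on newer runtimes
using System.Diagnostics;

static class BatchSizeBenchmark
{
    // Loads the same source once per candidate batch size and prints the elapsed time.
    // openSource must return a fresh reader over the same data on every call.
    public static void Run(string connectionString, string tableName,
                           Func<IDataReader> openSource, int[] batchSizes)
    {
        foreach (int batchSize in batchSizes)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                // Empty the target so every run starts from the same state.
                // tableName is assumed to be a trusted value, not user input.
                using (var truncate = new SqlCommand("TRUNCATE TABLE " + tableName, connection))
                {
                    truncate.ExecuteNonQuery();
                }

                using (IDataReader reader = openSource())
                using (var bulkCopy = new SqlBulkCopy(connection))
                {
                    bulkCopy.DestinationTableName = tableName;
                    bulkCopy.BatchSize = batchSize;
                    bulkCopy.BulkCopyTimeout = 0; // 0 = no time limit while measuring

                    Stopwatch timer = Stopwatch.StartNew();
                    bulkCopy.WriteToServer(reader);
                    timer.Stop();

                    Console.WriteLine("BatchSize {0,7}: {1:F1} s",
                                      batchSize, timer.Elapsed.TotalSeconds);
                }
            }
        }
    }
}
```

Calling it with something like new[] { 500, 5000, 10000, 50000 } would reproduce the range tried above. Repeating each run a few times and averaging is probably wise, since a single timing can be skewed by caching or other load on the server.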
This is an issue I have also spent some time looking into. I am looking to optimize importing large CSV files (16+ GB, 65+ million records, and growing) into a SQL Server 2005 database using a C# console application (.NET 2.0). As Jeremy has already pointed out, you will need to do some fine-tuning for your particular circumstances, but I would recommend starting with an initial batch size of 500 and testing values both above and below it.
I got the recommendation to test values between 100 and 1000 for batch size from this MSDN forum post, and was skeptical. But when I tested for batch sizes between 100 and 10,000, I found that 500 was the optimal value for my application. The 500 value for SqlBulkCopy.BatchSize is also recommended here.
To further optimize your SqlBulkCopy operation, check out this MSDN advice; I find that using SqlBulkCopyOptions.TableLock helps to reduce loading time.
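For illustration, combining SqlBulkCopyOptions.TableLock with an explicit BatchSize might look like the sketch below. The class BulkLoader, the method Load, and its parameters are hypothetical names chosen for this example, and the value 500 is just the starting point suggested above, not a universal recommendation.

```csharp
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient exposes the same types on newer runtimes

static class BulkLoader
{
    // Copies all rows from reader into tableName using a table-level bulk update lock.
    public static void Load(string connectionString, string tableName, IDataReader reader)
    {
        // TableLock obtains a bulk update lock for the duration of the copy
        // instead of the default row locks.
        using (var bulkCopy = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
        {
            bulkCopy.DestinationTableName = tableName;
            bulkCopy.BatchSize = 500; // starting point only; tune by measurement
            bulkCopy.WriteToServer(reader);
        }
    }
}
```

Whether the table lock helps depends on the workload: it blocks other writers to the table while the copy runs, so it is most attractive for dedicated import windows.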