I have an SSIS package that loads 2.5 GB of data (about 10 million records) into a SQL Server database with 10 partitions, including the PRIMARY filegroup.
Before changing the default Maximum insert commit size (2147483647) and Rows per batch, the transfer took 7 minutes to complete with the fast load option.
After changing them to sensible values using the formula below, the execution finished in only 2 minutes.
FYI: DefaultBufferMaxRows and DefaultBufferSize were left at their default values in both scenarios, i.e. 10,000 rows and 10 MB respectively.
The following calculation was used to derive Maximum insert commit size and Rows per batch.
1) Calculate the length of a record coming from the source, which comes to roughly 1,040 bytes (a catalog-based cross-check follows this list):
CREATE TABLE [dbo].[Game_DATA2](
[ID] [int] IDENTITY(1,1) NOT NULL, -- AUTO CALCULATED
[Number] [varchar](255) NOT NULL, -- 255 bytes
[AccountTypeId] [int] NOT NULL, -- 4 bytes
[Amount] [float] NOT NULL, -- 8 bytes
[CashAccountNumber] [varchar](255) NULL, -- 255 bytes
[StartDate] [datetime] NULL,-- 8 bytes
[Status] [varchar](255) NOT NULL,-- 255 bytes
[ClientCardNumber] [varchar](255) NULL -- 255 bytes
)
2) Rows per batch = packet size / bytes per record = 32767 / 1040 ≈ 32 (approx.)
3) Maximum insert commit size = packet size * number of transactions = 32767 * 100 = 3,276,700 (packet size and number of transactions are variables and can be changed as per requirement).
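As a rough cross-check on step 1, the declared maximum row size can also be pulled from the catalog views. This is only an estimate and assumes the table above exists in the current database; variable-length columns are counted at their declared maximum, just as in the manual calculation.

SELECT t.name            AS table_name,
       SUM(c.max_length) AS declared_max_row_bytes
FROM sys.tables  t
JOIN sys.columns c ON c.object_id = t.object_id
WHERE t.name = 'Game_DATA2'
GROUP BY t.name;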
Question:
Is there any relationship between Rows per batch and Maximum insert commit size? There is no information about this in the archived article on tuning DFT (Data Flow Task) execution.
Do these settings work together with DefaultBufferSize and DefaultBufferMaxRows? If yes, how?
These parameters apply only to the DFT OLE DB Destination in fast load mode. In fast load mode, the OLE DB Destination issues an insert bulk command, and these two parameters control it in the following way:
BULK INSERT (Transact-SQL) - MS Article on this command.
DefaultBufferSize and DefaultBufferMaxRows control RAM buffer management inside the DFT itself and do not interfere with the options mentioned above.
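To make the distinction concrete: with the roughly 1,040-byte row estimated in the question and the default settings, the engine could fit about 10,485,760 / 1,040 ≈ 10,082 rows into a 10 MB buffer, but it caps each buffer at DefaultBufferMaxRows = 10,000 rows. Buffer sizing is therefore decided independently of how often the destination commits.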
Rows per batch - The default value for this setting is -1, which specifies that all incoming rows will be treated as a single batch. You can change this default behavior and break the incoming rows into multiple batches. The only allowed value is a positive integer, which specifies the maximum number of rows in a batch.
Maximum insert commit size - The default value for this setting is 2147483647 (the largest value for a 4-byte integer type), which specifies that all incoming rows are committed once on successful completion. You can specify a positive value for this setting to indicate that a commit will be done after that many records. You might be wondering whether changing the default puts overhead on the data flow engine by committing several times. Yes, that is true, but at the same time it relieves the pressure on the transaction log and tempdb, which would otherwise grow tremendously during high-volume data transfers.
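For reference, here is a minimal sketch of where the analogous knobs appear on the T-SQL BULK INSERT command linked above. The file path and TABLOCK hint are illustrative assumptions; in SSIS these values are set on the OLE DB Destination rather than written by hand. BATCHSIZE plays roughly the role of Maximum insert commit size, while ROWS_PER_BATCH roughly corresponds to Rows per batch.

-- Commit every 3,276,700 rows instead of using one single transaction:
BULK INSERT dbo.Game_DATA2
FROM 'C:\exports\game_data.dat'    -- hypothetical data file
WITH (BATCHSIZE = 3276700, TABLOCK);

-- Alternatively, keep a single commit but hint the approximate row count
-- so the engine can plan the load:
BULK INSERT dbo.Game_DATA2
FROM 'C:\exports\game_data.dat'
WITH (ROWS_PER_BATCH = 10000000, TABLOCK);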
The above two settings are very important to understand in order to improve the performance of tempdb and the transaction log. For example, if you leave Maximum insert commit size at its default, the transaction log and tempdb will keep growing during the load, and if you are transferring a high volume of data, tempdb will soon run out of space and the load will fail. So it is recommended to set these values to an optimum value based on your environment.
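If you want to observe that effect yourself, one way is to poll log usage while the package runs. The query below assumes SQL Server 2012 or later; on older versions DBCC SQLPERF(LOGSPACE) gives a similar per-database picture.

-- Run in the destination database while the data flow is executing
SELECT total_log_size_in_bytes / 1048576.0 AS log_size_mb,
       used_log_space_in_bytes / 1048576.0 AS used_log_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;

-- Per-database log usage across the whole instance
DBCC SQLPERF(LOGSPACE);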
Note: The above recommendations are based on experience gained working with DTS and SSIS over the last couple of years. But as noted before, there are other factors which impact performance, among them infrastructure and network. So you should do thorough testing before putting these changes into your production environment.