Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Insert / Update 50 Millions records in SQL Server fastest way

Tags:

sql

sql-server

I'm trying to insert and update 50 Million into a table using Store procedure from staging table to base table. But the process is taking much much longer than expected approx 5-6 hours.

I tried implementing few indexes for performance tuning but none of them are working on the first time.

Any suggestion on how to handle and do the large data insertion.

Our SSIS program dumps data from CSV dumps inside the staging table and then our store procedure run in the end, the below insertion is causing time delay,

Code snippet - Staging to base table insertion

    INSERT INTO [dbo].[Entries] (
        [TransactionID]
        ,[TimekID]
        ,[ResourceID]
        ,[INVOICE_DATE]
        ,[INVOICE_NUMBER]
        ,[CLIENT_ID]
        ,[LAW_FIRM_MATTER_ID]
        ,[INVOICE_TOTAL]
        ,[BILLING_START_DATE]
        ,[BILLING_END_DATE]
        ,[INVOICE_DESCRIPTION]
        ,[ITEM_NUMBER]
        ,[EXP_FEE_INV_ADJ_TYPE]
        ,[ITEM_NUMBER_OF_UNITS]
        ,[ITEM_ADJUSTMENT_AMOUNT]
        ,[ITEM_TOTAL]
        ,[ITEM_DATE]
        ,[ITEM_TimeK_CODE]
        ,[ITEM_EXPENSE_CODE]
        ,[ITEM_ACTIVITY_CODE]
        ,[TIMEKEEPER_ID]
        ,[ITEM_DESCRIPTION]
        ,[LAW_FIRM_ID]
        ,[ITEM_UNIT_COST]
        ,[TIMEKEEPER_NAME]
        ,[TIMEKEEPER_CLASSIFICATION]
        ,[CLIENT_MATTER_ID]
        ,[TIMEKEEPER_EMAIL]
        ,[CreatedBy]
        ,[CreatedOn]
        ,[ModifiedBy]
        ,[ModifiedOn]
        ,[DeletedFlag]
        ,[sid]
        ,[ITEM_BASE_NUMBER_OF_UNITS]
        ,[ITEM_BASE_TOTAL]
        ,[UNIQUE_ID]
        ,[IsCancelled]
        ,[ENTRY_CLOSED_DATE]
        )
    SELECT ts.TransactionID
        ,(
            SELECT t1.ID
            FROM [dbo].[time] t1
            INNER JOIN [dbo].[TimekMaster] tm ON t1.ABATimekCode = tm.ID
                AND (
                    tm.Code = ts.ITEM_TimeK_CODE
                    OR tm.NAME = ts.ITEM_TimeK_CODE
                    )
            WHERE t1.sid = @SID
                AND t1.DeletedFlag = 0
                AND m.id = t1.matterid
            ) ID
        ,(
            SELECT r.[ID]
            FROM [dbo].[Resource] r
            WHERE r.[FirmUserID] = ts.[TIMEKEEPER_ID]
                AND r.[sid] = @SID
                AND r.[DeletedFlag] = 0
            ) ResourceID
        ,ts.INVOICE_DATE
        ,ts.INVOICE_NUMBER
        ,ts.CLIENT_ID
        ,ts.LAW_FIRM_MATTER_ID
        ,REPLACE(ts.INVOICE_TOTAL, ',', '')
        ,ts.BILLING_START_DATE
        ,ts.BILLING_END_DATE
        ,ts.INVOICE_DESCRIPTION
        ,ts.ITEM_NUMBER
        ,ts.EXP_FEE_INV_ADJ_TYPE
        ,REPLACE(ts.ITEM_NUMBER_OF_UNITS, ',', '')
        ,REPLACE(ts.ITEM_ADJUSTMENT_AMOUNT, ',', '')
        ,REPLACE(ts.ITEM_TOTAL, ',', '')
        ,ts.[ITEM_DATE] --ITEM_DATE, varchar(1000),>
        ,ts.ITEM_TimeK_CODE
        ,ts.ITEM_EXPENSE_CODE
        ,ts.ITEM_ACTIVITY_CODE
        ,ts.TIMEKEEPER_ID
        ,CAST(ts.ITEM_DESCRIPTION AS VARCHAR(4000))
        ,ts.LAW_FIRM_ID
        ,REPLACE(ts.ITEM_UNIT_COST, ',', '')
        ,ts.TIMEKEEPER_NAME
        ,ts.TIMEKEEPER_CLASSIFICATION
        ,ts.CLIENT_MATTER_ID
        ,ts.TIMEKEEPER_EMAIL
        ,0 --<CreatedBy, int,>
        ,GETDATE() --<CreatedOn, datetime2(7),>
        ,NULL --<ModifiedBy, int,>
        ,NULL --<ModifiedOn, datetime2(7),>
        ,0 --<DeletedFlag, bit,>
        ,@SID --<sid, int,>
        ,REPLACE(ts.[ITEM_NUMBER_OF_UNITS_BASE], ',', '') 
        ,REPLACE(ts.[ITEM_BASE_TOTAL], ',', '') 
        ,ts.[UNIQUE_ITEM_NUMBER] --<UNIQUE_DOCKET_ID, varchar(1000),>
        ,0 --ts.[ITEM_CANCEL] --<IsCancelled, bit,>
        ,ts.[ENTRY_CLOSED_DATE] --<ENTRY_CLOSED_DATE, varchar(100),>)
    FROM E_Staging ts
    INNER JOIN matter m ON m.firmmatternumber = ts.LAW_FIRM_MATTER_ID
        AND m.[sid] = @SID
        AND m.[DeletedFlag] = 0
    WHERE 1 = (
            SELECT COUNT(1)
            FROM [dbo].[Time] t1
            INNER JOIN [dbo].[TimekMaster] tm ON t1.ABATimekCode = tm.ID
                AND (
                    tm.Code = ts.ITEM_TimeK_CODE
                    OR tm.NAME = ts.ITEM_TimeK_CODE
                    )
            WHERE t1.sid = @SID
                AND t1.DeletedFlag = 0
                AND m.id = t1.matterid
            )
        AND ts.[Sid] = @SID
        AND ts.[TransactionID] = @TxnID
        AND NOT EXISTS (
            SELECT 1
            FROM [dbo].[Entries] te
            WHERE te.[sid] = @SID
                AND te.[DeletedFlag] = 0
                AND ts.[UNIQUE_ITEM_NUMBER] = te.[UNIQUE_ID]
            )

BASE Table

CREATE TABLE [dbo].[Entries] (
    [ID] INT IDENTITY(1, 1) NOT NULL
    ,[TransactionID] INT NOT NULL
    ,[TaskID] INT NULL
    ,[ResourceID] INT NULL
    ,[INVOICE_DATE] VARCHAR(1000) NULL
    ,[INVOICE_NUMBER] VARCHAR(1000) NULL
    ,[CLIENT_ID] VARCHAR(1000) NULL
    ,[LAW_FIRM_MATTER_ID] VARCHAR(1000) NULL
    ,[INVOICE_TOTAL] NUMERIC(14, 2) NULL
    ,[BILLING_START_DATE] VARCHAR(1000) NULL
    ,[BILLING_END_DATE] VARCHAR(1000) NULL
    ,[INVOICE_DESCRIPTION] VARCHAR(1000) NULL
    ,[LINE_ITEM_NUMBER] VARCHAR(1000) NULL
    ,[EXP_FEE_INV_ADJ_TYPE] VARCHAR(1000) NULL
    ,[LINE_ITEM_NUMBER_OF_UNITS] NUMERIC(14, 2) NULL
    ,[LINE_ITEM_ADJUSTMENT_AMOUNT] NUMERIC(14, 2) NULL
    ,[LINE_ITEM_TOTAL] NUMERIC(14, 2) NULL
    ,[LINE_ITEM_DATE] VARCHAR(1000) NULL
    ,[LINE_ITEM_TASK_CODE] VARCHAR(1000) NULL
    ,[LINE_ITEM_EXPENSE_CODE] VARCHAR(1000) NULL
    ,[LINE_ITEM_ACTIVITY_CODE] VARCHAR(1000) NULL
    ,[TIMEKEEPER_ID] VARCHAR(1000) NULL
    ,[LINE_ITEM_DESCRIPTION] VARCHAR(4000) NULL
    ,[LAW_FIRM_ID] VARCHAR(1000) NULL
    ,[LINE_ITEM_UNIT_COST] NUMERIC(14, 2) NULL
    ,[TIMEKEEPER_NAME] VARCHAR(1000) NULL
    ,[TIMEKEEPER_CLASSIFICATION] VARCHAR(1000) NULL
    ,[CLIENT_MATTER_ID] VARCHAR(1000) NULL
    ,[TIMEKEEPER_EMAIL] VARCHAR(1000) NULL
    ,[CreatedBy] INT NULL
    ,[CreatedOn] DATETIME2(7) DEFAULT(sysutcdatetime()) NULL
    ,[ModifiedBy] INT NULL
    ,[ModifiedOn] DATETIME2(7) NULL
    ,[DeletedFlag] BIT DEFAULT((0)) NULL
    ,[sid] INT NOT NULL
    ,[LINE_ITEM_BASE_NUMBER_OF_UNITS] NUMERIC(14, 2) NULL
    ,[LINE_ITEM_BASE_TOTAL] NUMERIC(14, 2) NULL
    ,[UNIQUE_ID] VARCHAR(1000) NULL
    ,[IsCancelled] BIT DEFAULT((0)) NOT NULL
    ,[ENTRY_CLOSED_DATE] VARCHAR(100) NULL
    ,[MappingStatus] TINYINT DEFAULT((0)) NOT NULL
    ,[LINE_ITEM_PHASE_CODE] VARCHAR(1000) NULL
    ,[Override] BIT DEFAULT((0)) NOT NULL
    ,[e_INVOICE_DESCRIPTION] VARCHAR(1000) NULL
    ,[e_CLIENT_ID] VARBINARY(8000) NULL
    ,[e_LINE_ITEM_DESCRIPTION] VARCHAR(8000) NULL
    ,[e_CLIENT_MATTER_ID] VARCHAR(1000) NULL
    ,[AmountDocketed] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[AmountBilled] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[AmountWrittenDown] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[AmountWrittenUp] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[AmountWIP] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[HourDocketed] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[HourBilled] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[HourWrittenDown] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[HourWrittenUp] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[HourWIP] NUMERIC(14, 2) DEFAULT((0)) NOT NULL
    ,[MappedBy] INT NULL DEFAULT((0))
    ,[MappedOn] DATETIME2 NULL DEFAULT(sysutcdatetime())
    ,[RemapBy] INT NULL
    ,CONSTRAINT [PK_utb_entries] PRIMARY KEY CLUSTERED ([ID] ASC)
    ,FOREIGN KEY ([TaskID]) REFERENCES [dbo].[Task]([ID])
    ,FOREIGN KEY ([TaskID]) REFERENCES [dbo].[Task]([ID])
    );
GO

CREATE NONCLUSTERED INDEX [Entries_TaskID] ON [dbo].[Entries] ([TaskID] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_LAW_FIRM_MATTER_ID] ON [dbo].[Entries] ([LAW_FIRM_MATTER_ID] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_UNIQUE_DOCKET_ID] ON [dbo].[Entries] ([UNIQUE_DOCKET_ID] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_ResourceID] ON [dbo].[Entries] ([ResourceID] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_INVOICE_NUMBER] ON [dbo].[Entries] ([INVOICE_NUMBER] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_EXP_FEE_INV_ADJ_TYPE] ON [dbo].[Entries] ([EXP_FEE_INV_ADJ_TYPE] ASC);
GO

CREATE NONCLUSTERED INDEX [Entries_IsCancelled] ON [dbo].[Entries] ([IsCancelled] ASC);
GO

CREATE NONCLUSTERED INDEX [IX_Entries_Grouped] ON [dbo].[Entries] (
    [TaskID] ASC
    ,[sid] ASC
    ,[DeletedFlag] ASC
    ,[ID] ASC
    ) INCLUDE (
    [INVOICE_NUMBER]
    ,[LAW_FIRM_MATTER_ID]
    ,[LINE_ITEM_NUMBER_OF_UNITS]
    ,[LINE_ITEM_TOTAL]
    ,[LINE_ITEM_BASE_NUMBER_OF_UNITS]
    ,[LINE_ITEM_BASE_TOTAL]
    ,[IsCancelled]
    )
    WITH (
            SORT_IN_TEMPDB = OFF
            ,DROP_EXISTING = OFF
            ,ONLINE = OFF
            ) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Entries_Grouped1] ON [dbo].[Entries] (
    [sid] ASC
    ,[DeletedFlag] ASC
    ,[IsCancelled] ASC
    ) INCLUDE (
    [ID]
    ,[LAW_FIRM_MATTER_ID]
    ,[EXP_FEE_INV_ADJ_TYPE]
    ,[LINE_ITEM_BASE_NUMBER_OF_UNITS]
    ,[LINE_ITEM_BASE_TOTAL]
    )
    WITH (
            SORT_IN_TEMPDB = OFF
            ,DROP_EXISTING = OFF
            ,ONLINE = OFF
            ) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Entries_Grouped2] ON [dbo].[Entries] (
    [sid] ASC
    ,[DeletedFlag] ASC
    ,[ID] ASC
    ,[LINE_ITEM_BASE_TOTAL] ASC
    ,[TaskID] ASC
    ) INCLUDE ([LAW_FIRM_MATTER_ID])
    WITH (
            SORT_IN_TEMPDB = OFF
            ,DROP_EXISTING = OFF
            ,ONLINE = OFF
            ) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Entries_Grouped3] ON [dbo].[Entries] (
    [TaskID] ASC
    ,[sid] ASC
    ,[DeletedFlag] ASC
    ,[ID] ASC
    ,[ResourceID] ASC
    ) INCLUDE (
    [INVOICE_NUMBER]
    ,[EXP_FEE_INV_ADJ_TYPE]
    ,[LINE_ITEM_NUMBER_OF_UNITS]
    ,[LINE_ITEM_TOTAL]
    ,[LINE_ITEM_DATE]
    ,[LINE_ITEM_BASE_NUMBER_OF_UNITS]
    ,[LINE_ITEM_BASE_TOTAL]
    ,[IsCancelled]
    ,[ENTRY_CLOSED_DATE]
    )
    WITH (
            SORT_IN_TEMPDB = OFF
            ,DROP_EXISTING = OFF
            ,ONLINE = OFF
            ) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Entries_Grouped4] ON [dbo].[Entries] (
    [TaskID] ASC
    ,[sid] ASC
    ,[IsCancelled] ASC
    ,[DeletedFlag] ASC
    ,[ID] ASC
    ,[ResourceID] ASC
    ) INCLUDE (
    [INVOICE_NUMBER]
    ,[EXP_FEE_INV_ADJ_TYPE]
    ,[LINE_ITEM_NUMBER_OF_UNITS]
    ,[LINE_ITEM_TOTAL]
    ,[LINE_ITEM_DATE]
    ,[LINE_ITEM_BASE_NUMBER_OF_UNITS]
    ,[LINE_ITEM_BASE_TOTAL]
    ,[ENTRY_CLOSED_DATE]
    )
    WITH (
            SORT_IN_TEMPDB = OFF
            ,DROP_EXISTING = OFF
            ,ONLINE = OFF
            ) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [Entries_LINE_ITEM_DATE] ON [dbo].[Entries] ([LINE_ITEM_DATE])
GO

CREATE STATISTICS [Stat_Entries_1] ON [dbo].[Entries] (
    [IsCancelled]
    ,[ID]
    ,[TaskID]
    ,[sid]
    )
GO

staging table

CREATE TABLE [dbo].[E_Staging] (
    [ID] INT IDENTITY(1, 1) NOT NULL
    ,[INVOICE_DAE] VARCHAR(1000) NULL
    ,[INVOICE_NUMBER] VARCHAR(1000) NULL
    ,[CLIENT_ID] VARCHAR(1000) NULL
    ,[LAW_FIRM_MATER_ID] VARCHAR(1000) NULL
    ,[INVOICE_TOTAL] VARCHAR(1000) NULL
    ,[BILLING_START_DAE] VARCHAR(1000) NULL
    ,[BILLING_END_DAE] VARCHAR(1000) NULL
    ,[INVOICE_DESCRIPTION] VARCHAR(8000) NULL
    ,[IEM_NUMBER] VARCHAR(1000) NULL
    ,[EXP_FEE_INV_ADJ_TYPE] VARCHAR(1000) NULL
    ,[IEM_NUMBER_OF_UNITS] VARCHAR(1000) NULL
    ,[IEM_NUMBER_OF_UNITS_BASE] VARCHAR(1000) NULL
    ,[IEM_ADJUSTMENT_AMOUNT] VARCHAR(1000) NULL
    ,[IEM_TOTAL] VARCHAR(1000) NULL
    ,[IEM_BASE_TOTAL] VARCHAR(1000) NULL
    ,[IEM_DAE] VARCHAR(100) NULL
    ,[IEM_TASK_CODE] VARCHAR(1000) NULL
    ,[IEM_EXPENSE_CODE] VARCHAR(1000) NULL
    ,[IEM_ACTIVITY_CODE] VARCHAR(1000) NULL
    ,[TIMEKEEPER_ID] VARCHAR(1000) NULL
    ,[IEM_DESCRIPTION] VARCHAR(8000) NULL
    ,[LAW_FIRM_ID] VARCHAR(1000) NULL
    ,[IEM_UNIT_COST] VARCHAR(1000) NULL
    ,[TIMEKEEPER_NAME] VARCHAR(1000) NULL
    ,[TIMEKEEPER_CLASSIFICATION] VARCHAR(1000) NULL
    ,[CLIENT_MATER_ID] VARCHAR(1000) NULL
    ,[TIMEKEEPER_EMAIL] VARCHAR(1000) NULL
    ,[UNIQUE_ITEM_NUMBER] VARCHAR(1000) NULL
    ,[IEM_CANCEL] VARCHAR(1000) NULL
    ,[Sid] INT NULL
    ,[TransactionID] INT NULL
    ,[ValidationStatus] VARCHAR(20) NULL
    ,[ValidationMessage] VARCHAR(200) NULL
    ,[UploadStatus] VARCHAR(20) NULL
    ,[StatusMessage] VARCHAR(50) NULL
    ,[CREATEdOn] DAETIME NULL DEFAULT(sysutcdaEtime())
    ,[FirmLastModifiedon] VARCHAR(1000) NULL
    ,[ENTRY_CLOSED_DAE] VARCHAR(1000) NULL
    ,[UNIQUE_Docket_ID2] VARCHAR(1000) NULL
    ,[DockeEd_Released_Billed] VARCHAR(1000) NULL
    ,[IEM_PHASE_CODE] VARCHAR(1000) NULL
    ,[WriE_Down] VARCHAR(200) NULL
    ,[e_ClientId] VARBINARY(MAX) NULL
    ,[e_INVOICE_DESCRIPTION] VARCHAR(1000) NULL
    ,[e_IEM_DESCRIPTION] VARCHAR(8000) NULL
    ,[e_CLIENT_MATER_ID] VARCHAR(1000) NULL
    ,CONSTRAINT [PK_E_Staging] PRIMARY KEY CLUSERED([ID] ASC)
    );
GO

CREATE NONCLUSERED INDEX [nci_wi_E_Staging_2150E36DEE003CE53D44FC548A4CD6CC] ON [dbo].[E_Staging] (
    [Sid] ASC
    ,[TransactionID] ASC
    )
GO

CREATE NONCLUSERED INDEX IX_E_Staging_CREATEDON ON [dbo].[E_Staging] (CREATEDON)
GO

CREATE NONCLUSERED INDEX IX_E_Staging_LAW_FIRM_MATER_ID ON [dbo].[E_Staging] (LAW_FIRM_MATER_ID)
GO

CREATE NONCLUSERED INDEX IX_E_Staging_UNIQUE_IEM_NUMBER ON [dbo].[E_Staging] (UNIQUE_IEM_NUMBER)
GO
like image 668
Shaitender Singh Avatar asked Dec 23 '22 01:12

Shaitender Singh


2 Answers

I can suggest that instead of creating new indexes on target table, ideally you should be disabling indexes before insertion.

  1. Disable indexes and constraints on the target table.
  2. Insert the records.
  3. Enable indexes and constraints.

This should give you better performance for insertion.

Apart from this, you need to optimize your select query. You can get more help if you share the select query also.

like image 192
PSK Avatar answered Jan 17 '23 15:01

PSK


Better than using stored procedures, I would recommend you to use the SQL Server import assistant:

  • On the database, right click + Import
  • Select the database, then select the table
  • Selecting the table, can click it and click on "Enable Mappings"
  • Click on "Delete rows in destination table" if you want to remove previous data
  • Very important "Enable identity insert" to allow to copy the IDENTITY columns across the databases
  • Follow with the default options to the end of the Wizard

From my experience, this is in general by far much quicker than using specific manual solutions.

like image 37
Angel M. Avatar answered Jan 17 '23 16:01

Angel M.