Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to insert data parellel in three different tables

I have a stored procedure which will insert the bulk of records, now is there any possibility to insert data into 3 tables in parallel;

  • First table inserting 1 million records.
  • Second table inserting 1.5 million records.
  • Third table inserting 500k records

As per my knowledge - procedure insertion is happening one after other.

So how can I implement the loading in parallel?

like image 663
katraj Avatar asked Sep 22 '16 06:09

katraj


People also ask

How do I insert data into multiple tables at once?

No, you can't insert into multiple tables in one MySQL command. You can however use transactions. BEGIN; INSERT INTO users (username, password) VALUES('test', 'test'); INSERT INTO profiles (userid, bio, homepage) VALUES(LAST_INSERT_ID(),'Hello world!

Can you split table into three tables?

On the LAYOUT tab, in the Merge group, click Split Table. The table splits into two tables. You can split the table further, as long as there are multiple rows.

How do I insert multiple rows from one table to another in SQL?

The INSERT statement also allows you to insert multiple rows into a table using a single statement as the following: INSERT INTO table_name(column1,column2…) VALUES (value1,value2,…), (value1,value2,…), … In this form, you need to provide multiple lists of values, each list is separated by a comma.

What are the three ways to work with multiple tables in the same query?

Three Main Ways to Combine ResultsJOIN – You can use joins to combine columns from one or more queries into one result. UNION – Use Unions and other set operators to combine rows from one or more queries into one result. Sub Queries – I sometimes call these nested queries.


Video Answer


3 Answers

Statements execute synchronously within a T-SQL batch. To execute multiple statements asynchronously and in parallel from a stored procedure, you'll need to use multiple concurrent database connections. Note that the tricky part with asynchronous execution is determining not only when all the tasks have completed, but also whether they have succeeded or failed.

Method 1: SSIS package

Create an SSIS package to execute the 3 SQL statements in parallel. In SQL 2012 and later, run the package using SSIS catalog stored procedures. Pre-SQL 2012, you'll need to create a SQL Agent job for the package and launch with sp_start_job.

You'll need to check the SSIS execution status or SQL Agent job status to determine completion, and success/failure result.

Method 2: Powershell and SQL Agent

Execute a SQL Agent job that runs a Powershell script that executes the queries in parallel using Powershell background jobs (Start-Job command). The script can return an exit code, zero for success and non-zero for failure, so that SQL Agent can determine if it succeeded. Check SQL Agent job status to determine completion, and success/failure result.

Method 3: Multiple SQL Agent jobs

Execute multiple SQL Agent jobs concurrently, each with a T-SQL job step containting the import script. Check SQL Agent job status of each job to determine completion, and success/failure result.

Method 4: Service Broker Use a queue activated proc to execute the import scripts in parallel. This can be obtuse if you haven't used Service broker before and it is important to follow vetted patterns. I've included an example to get you started (replace THROW with RAISERROR for pre-SQL 2012). The database must have Service Broker enabled, which is enabled by default but turned off following a restore or attach.

USE YourDatabase;
Go

--create proc that will be automatically executed (activated) when requests are waiting
CREATE PROC dbo.ExecuteTSqlTask
AS
SET NOCOUNT ON;

DECLARE
      @TSqlJobConversationHandle uniqueidentifier = NEWID()
    , @TSqlExecutionRequestMessage xml
    , @TSqlExecutionResultMessage xml
    , @TSqlExecutionResult varchar(10)
    , @TSqlExecutionResultDetails nvarchar(MAX)
    , @TSqlScript nvarchar(MAX)
    , @TSqlTaskName sysname
    , @RowsAffected int
    , @message_type_name sysname;

WHILE 1 = 1
BEGIN

    --get the next task to execute
    WAITFOR (
        RECEIVE TOP (1)
              @TSqlJobConversationHandle = conversation_handle
            , @TSqlExecutionRequestMessage = CAST(message_body AS xml)
            , @message_type_name = message_type_name
        FROM dbo.TSqlExecutionQueue
        ), TIMEOUT 1000;

    IF @@ROWCOUNT = 0
    BEGIN
        --no work to do - exit
        BREAK;
    END;

    IF @message_type_name = N'TSqlExecutionRequest'
    BEGIN

        --get task name and script
        SELECT
              @TSqlTaskName = @TSqlExecutionRequestMessage.value('(/TSqlTaskName)[1]', 'sysname')
            , @TSqlScript = @TSqlExecutionRequestMessage.value('(/TSqlScript)[1]', 'nvarchar(MAX)');

        --execute script
        BEGIN TRY
            EXEC sp_executesql @TSqlScript;
            SET @RowsAffected = @@ROWCOUNT;
            SET @TSqlExecutionResult = 'Completed';
            SET @TSqlExecutionResultDetails = CAST(@RowsAffected as varchar(10)) + ' rows affected';
        END TRY
        BEGIN CATCH
            SET @TSqlExecutionResult = 'Erred';
            SET @TSqlExecutionResultDetails = 
                  'Msg ' + CAST(ERROR_NUMBER() AS varchar(10))
                + ', Level ' + CAST(ERROR_SEVERITY() AS varchar(2))
                + ', State ' + CAST(ERROR_STATE() AS varchar(10))
                + ', Line ' + CAST(ERROR_LINE() AS varchar(10))
                + ': ' + ERROR_MESSAGE();
        END CATCH;

        --send execution result back to initiator
        SET @TSqlExecutionResultMessage = '<TSqlTaskName /><TSqlExecutionResult /><TSqlExecutionResultDetails />';
        SET @TSqlExecutionResultMessage.modify('insert text {sql:variable("@TSqlTaskName")} into (/TSqlTaskName)[1] ');
        SET @TSqlExecutionResultMessage.modify('insert text {sql:variable("@TSqlExecutionResult")} into (/TSqlExecutionResult)[1] ');
        SET @TSqlExecutionResultMessage.modify('insert text {sql:variable("@TSqlExecutionResultDetails")} into (/TSqlExecutionResultDetails)[1] ');
        SEND ON CONVERSATION @TSqlJobConversationHandle
            MESSAGE TYPE TSqlExecutionResult
            (@TSqlExecutionResultMessage);

    END
    ELSE
    BEGIN
        IF @message_type_name = N'TSqlJobComplete'
        BEGIN
            --service has ended conversation so we're not going to get any more execution requests
            END CONVERSATION @TSqlJobConversationHandle;
        END
        ELSE
        BEGIN
            END CONVERSATION @TSqlJobConversationHandle WITH ERROR = 1 DESCRIPTION = 'Unexpected message type received by ExecuteTSqlTask';
            RAISERROR('Unexpected message type received (%s) by ExecuteTSqlTask', 16, 1, @message_type_name);
        END;
    END;
END;
GO

CREATE QUEUE dbo.TSqlResultQueue;
CREATE QUEUE dbo.TSqlExecutionQueue
    WITH STATUS=ON,
    ACTIVATION (
          STATUS = ON
        , PROCEDURE_NAME = dbo.ExecuteTSqlTask
        , MAX_QUEUE_READERS = 3 --max number of concurrent activated proc instances 
        , EXECUTE AS OWNER
        );
CREATE MESSAGE TYPE TSqlExecutionRequest VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE TSqlExecutionResult VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE TSqlJobComplete VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT TSqlExecutionContract (
      TSqlExecutionRequest SENT BY INITIATOR
    , TSqlJobComplete SENT BY INITIATOR
    , TSqlExecutionResult SENT BY TARGET
    );
CREATE SERVICE TSqlJobService ON QUEUE dbo.TSqlResultQueue ([TSqlExecutionContract]);
CREATE SERVICE TSqlExecutorService ON QUEUE dbo.TSqlExecutionQueue ([TSqlExecutionContract]);
GO

CREATE PROC dbo.ExecuteParallelImportScripts
AS
SET NOCOUNT ON;

DECLARE
      @TSqlJobConversationHandle uniqueidentifier
    , @TSqlExecutionRequestMessage xml
    , @TSqlExecutionResultMessage xml
    , @TSqlExecutionResult varchar(10)
    , @TSqlExecutionResultDetails nvarchar(MAX)
    , @TSqlTaskName sysname
    , @CompletedCount int = 0
    , @ErredCount int = 0
    , @message_type_name sysname;

DECLARE @TsqlTask TABLE(
      TSqlTaskName sysname NOT NULL PRIMARY KEY 
    , TSqlScript nvarchar(MAX) NOT NULL
    );

BEGIN TRY

    --insert a row for each import task
    INSERT INTO @TsqlTask(TSqlTaskName, TSqlScript) 
        VALUES(N'ImportScript1', N'INSERT INTO dbo.Table1 SELECT * FROM dbo.Table1Staging;');
    INSERT INTO @TsqlTask(TSqlTaskName, TSqlScript) 
        VALUES(N'ImportScript2', N'INSERT INTO dbo.Table2 SELECT * FROM dbo.Table2Staging;');
    INSERT INTO @TsqlTask(TSqlTaskName, TSqlScript) 
        VALUES(N'ImportScript3', N'INSERT INTO dbo.Table3 SELECT * FROM dbo.Table3Staging;');

    --start a conversation for this import process
    BEGIN DIALOG CONVERSATION @TsqlJobConversationHandle
        FROM SERVICE TSqlJobService
        TO SERVICE 'TSqlExecutorService', 'CURRENT DATABASE'
        ON CONTRACT TSqlExecutionContract
        WITH ENCRYPTION = OFF;

    --send import tasks to executor service for parallel execution
    DECLARE JobTasks CURSOR LOCAL FAST_FORWARD FOR
        SELECT (SELECT TSqlTaskName, TSqlScript
            FROM @TsqlTask AS task 
            WHERE task.TSqlTaskName = job.TSqlTaskName
            FOR XML PATH(''), TYPE) AS TSqlExecutionRequest
        FROM @TsqlTask AS job;
    OPEN JobTasks;
    WHILE 1 = 1
    BEGIN
        FETCH NEXT FROM JobTasks INTO @TSqlExecutionRequestMessage;
        IF @@FETCH_STATUS = -1 BREAK;
        SEND ON CONVERSATION @TSqlJobConversationHandle
            MESSAGE TYPE TSqlExecutionRequest
            (@TSqlExecutionRequestMessage);
    END;
    CLOSE JobTasks;
    DEALLOCATE JobTasks;

    --get each parallel task execution result until all are complete
    WHILE 1 = 1
    BEGIN

        --get next task result
        WAITFOR (
            RECEIVE TOP (1)
                  @TSqlExecutionResultMessage = CAST(message_body AS xml)
                , @message_type_name = message_type_name
            FROM dbo.TSqlResultQueue
            WHERE conversation_handle = @TSqlJobConversationHandle
            ), TIMEOUT 1000;

        IF @@ROWCOUNT <> 0
        BEGIN

            IF @message_type_name = N'TSqlExecutionResult'
            BEGIN

                --get result of import script execution
                SELECT
                      @TSqlTaskName = @TSqlExecutionResultMessage.value('(/TSqlTaskName)[1]', 'sysname')
                    , @TSqlExecutionResult = @TSqlExecutionResultMessage.value('(/TSqlExecutionResult)[1]', 'varchar(10)')
                    , @TSqlExecutionResultDetails = COALESCE(@TSqlExecutionResultMessage.value('(/TSqlExecutionResultDetails)[1]', 'nvarchar(MAX)'), N'');
                RAISERROR('Import task %s %s: %s', 0, 0, @TSqlTaskName, @TSqlExecutionResult, @TSqlExecutionResultDetails) WITH NOWAIT;
                IF @TSqlExecutionResult = 'Completed'
                BEGIN
                    SET @CompletedCount += 1;
                END
                ELSE
                BEGIN
                    SET @ErredCount += 1;
                END;

                --remove task from tracking table after completion
                DELETE FROM @TSqlTask
                WHERE TSqlTaskName = @TSqlTaskName;

                IF NOT EXISTS(SELECT 1 FROM @TsqlTask)
                BEGIN
                    --all tasks are done - send TSqlJobComplete message to instruct executor service to end conversation
                    SEND ON CONVERSATION @TSqlJobConversationHandle
                        MESSAGE TYPE TSqlJobComplete;
                END
            END
            ELSE
            BEGIN
                IF @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
                BEGIN
                    --executor service has ended conversation so we're done
                    END CONVERSATION @TSqlJobConversationHandle;
                    BREAK;
                END
                ELSE
                BEGIN
                    END CONVERSATION @TSqlJobConversationHandle WITH ERROR = 1 DESCRIPTION = 'Unexpected message type received by ExecuteParallelInserts';
                    RAISERROR('Unexpected message type received (%s) by ExecuteParallelInserts', 16, 1, @message_type_name);
                END;
            END
        END;
    END;
    RAISERROR('Import processing completed. CompletedCount=%d, ErredCount=%d.', 0, 0, @CompletedCount, @ErredCount);
END TRY
BEGIN CATCH
    THROW;
END CATCH;
GO

--execute import scripts in parallel
EXEC dbo.ExecuteParallelImportScripts;
GO
like image 163
Dan Guzman Avatar answered Oct 12 '22 22:10

Dan Guzman


You can try creating three jobs and execute insert scripts in parallel as below :

DECLARE @jobId BINARY(16)
EXEC msdb.dbo.sp_add_job @job_name=N'Job1', 
        @enabled=1,  
        @description=N'No description available.', 
        @job_id = @jobId OUTPUT

EXEC msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Insert into First Table', 
        @step_id=1, 
        @cmdexec_success_code=0, 
        @on_success_action=1, 
        @on_success_step_id=0, 
        @on_fail_action=2, 
        @on_fail_step_id=0, 
        @retry_attempts=0, 
        @retry_interval=0, 
        @os_run_priority=0, @subsystem=N'TSQL', 
        @command=N'--Insert script for first table', 
        @database_name=N'Test', 
        @flags=0

EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
GO

DECLARE @jobId BINARY(16)
EXEC msdb.dbo.sp_add_job @job_name=N'Job2', 
        @enabled=1,  
        @description=N'No description available.', 
        @job_id = @jobId OUTPUT

EXEC msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Insert into second Table', 
        @step_id=1, 
        @cmdexec_success_code=0, 
        @on_success_action=1, 
        @on_success_step_id=0, 
        @on_fail_action=2, 
        @on_fail_step_id=0, 
        @retry_attempts=0, 
        @retry_interval=0, 
        @os_run_priority=0, @subsystem=N'TSQL', 
        @command=N'--Insert script for second table', 
        @database_name=N'Test', 
        @flags=0

EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
GO

DECLARE @jobId BINARY(16)
EXEC msdb.dbo.sp_add_job @job_name=N'Job3', 
        @enabled=1,  
        @description=N'No description available.', 
        @job_id = @jobId OUTPUT

EXEC msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Insert into Third Table', 
        @step_id=1, 
        @cmdexec_success_code=0, 
        @on_success_action=1, 
        @on_success_step_id=0, 
        @on_fail_action=2, 
        @on_fail_step_id=0, 
        @retry_attempts=0, 
        @retry_interval=0, 
        @os_run_priority=0, @subsystem=N'TSQL', 
        @command=N'--Insert script for third table', 
        @database_name=N'Test', 
        @flags=0

EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
GO

EXEC msdb.dbo.sp_start_job N'Job1' ; --All will execute in parallel
EXEC msdb.dbo.sp_start_job N'Job2' ;
EXEC msdb.dbo.sp_start_job N'Job3' ;
like image 26
Kannan Kandasamy Avatar answered Oct 12 '22 23:10

Kannan Kandasamy


Assuming that you would like to have the same insert date value for all inserts, define a date parameter set to the current date as shown.

DECLARE @InsertDate as date
SET @InsertDate = GetDate()

Then, pass the insert date parameter into your insert stored procedure and update this stored procedure accordingly to use that input. This will ensure the the same insert date value will be used for all inserts.

EXEC dbo.InsertTables123 @p1 = @InsertDate

The @InsertDate input parameter can also be manually assigned if something other than the current date if needed.

like image 2
JohnH Avatar answered Oct 12 '22 23:10

JohnH