Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is inserting into and joining #temp tables faster?

I have a query that looks like

SELECT
 P.Column1,
 P.Column2,
 P.Column3,
 ...
 (
   SELECT
       A.ColumnX,
       A.ColumnY,
       ...
   FROM
      dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
   WHERE
      A.Key = P.Key
   FOR XML AUTO, TYPE  
 ),
 (
   SELECT
       B.ColumnX,
       B.ColumnY,
       ...
   FROM
      dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B
   WHERE
      B.Key = P.Key
   FOR XML AUTO, TYPE  
 )
FROM
(
   <joined tables here>
) AS P
FOR XML AUTO,ROOT('ROOT') 

P has ~ 5000 rows A and B ~ 4000 rows each

This query has a runtime performance of ~10+ minutes.

Changing it to this however:

SELECT
 P.Column1,
 P.Column2,
 P.Column3,
 ...
INTO #P

SELECT
 A.ColumnX,
 A.ColumnY,
 ...
INTO #A     
FROM
 dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A

SELECT
 B.ColumnX,
 B.ColumnY,
 ...
INTO #B     
FROM
 dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B


SELECT
 P.Column1,
 P.Column2,
 P.Column3,
 ...
 (
   SELECT
       A.ColumnX,
       A.ColumnY,
       ...
   FROM
      #A AS A
   WHERE
      A.Key = P.Key
   FOR XML AUTO, TYPE  
 ),
 (
   SELECT
       B.ColumnX,
       B.ColumnY,
       ...
   FROM
      #B AS B
   WHERE
      B.Key = P.Key
   FOR XML AUTO, TYPE  
 )
FROM #P AS P
FOR XML AUTO,ROOT('ROOT')      

Has a performance of ~4 seconds.

This makes not a lot of sense, as it would seem the cost to insert into a temp table and then do the join should be higher by default. My inclination is that SQL is doing the wrong type of "join" with the subquery, but maybe I've missed it, there's no way to specify the join type to use with correlated subqueries.

Is there a way to achieve this without using #temp tables/@table variables via indexes and/or hints?

EDIT: Note that dbo.TableReturningFunc1 and dbo.TableReturningFunc2 are inline TVF's, not multi-statement, or they are "parameterized" view statements.

like image 301
Joseph Kingry Avatar asked May 28 '09 16:05

Joseph Kingry


People also ask

Can we use insert with join?

Example 5: INSERT INTO SELECT statement with Join clause to get data from multiple tables. We can use a JOIN clause to get data from multiple tables. These tables are joined with conditions specified with the ON clause. Suppose we want to get data from multiple tables and insert into a table.

What is the use of insert into?

The INSERT INTO statement is used to insert new records in a table.

What is the difference between insert into and select into?

INSERT INTO SELECT vs SELECT INTO: Both the statements could be used to copy data from one table to another. But INSERT INTO SELECT could be used only if the target table exists whereas SELECT INTO statement could be used even if the target table doesn't exist as it creates the target table if it doesn't exist.


2 Answers

Your procedures are being reevaluated for each row in P.

What you do with the temp tables is in fact caching the resultset generated by the stored procedures, thus removing the need to reevaluate.

Inserting into a temp table is fast because it does not generate redo / rollback.

Joins are also fast, since having a stable resultset allows possibility to create a temporary index with an Eager Spool or a Worktable

You can reuse the procedures without temp tables, using CTE's, but for this to be efficient, SQL Server needs to materialize the results of CTE.

You may try to force it do this with using an ORDER BY inside a subquery:

WITH    f1 AS
        (
        SELECT  TOP 1000000000
                A.ColumnX,
                A.ColumnY
        FROM    dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
        ORDER BY
                A.key
        ),
        f2 AS
        (
        SELECT  TOP 1000000000
                B.ColumnX,
                B.ColumnY,
        FROM    dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B  
        ORDER BY
                B.Key
        )
SELECT  …

, which may result in Eager Spool generated by the optimizer.

However, this is far from being guaranteed.

The guaranteed way is to add an OPTION (USE PLAN) to your query and wrap the correspondind CTE into the Spool clause.

See this entry in my blog on how to do that:

  • Generating XML in subqueries

This is hard to maintain, since you will need to rewrite your plan each time you rewrite the query, but this works well and is quite efficient.

Using the temp tables will be much easier, though.

like image 89
Quassnoi Avatar answered Oct 14 '22 14:10

Quassnoi


This answer needs to be read together with Quassnoi's article
http://explainextended.com/2009/05/28/generating-xml-in-subqueries/

With judicious application of CROSS APPLY, you can force the caching or shortcut evaluation of inline TVFs. This query returns instantaneously.

SELECT  *
FROM    (
        SELECT  (
                SELECT  f.num
                FOR XML PATH('fo'), ELEMENTS ABSENT
                ) AS x
        FROM    [20090528_tvf].t_integer i
        cross apply (
            select num
            from [20090528_tvf].fn_num(9990) f
            where f.num = i.num
            ) f
) q
--WHERE   x IS NOT NULL -- covered by using CROSS apply
FOR XML AUTO

You haven't provided real structures so it's hard to construct something meaningful, but the technique should apply as well.

If you change the multi-statement TVF in Quassnoi's article to an inline TVF, the plan becomes even faster (at least one order of magnitude) and the plan magically reduces to something I cannot understand (it's too simple!).

CREATE FUNCTION [20090528_tvf].fn_num(@maxval INT)  
RETURNS TABLE
AS RETURN 
        SELECT  num + @maxval   num
        FROM    t_integer  

Statistics

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

(10 row(s) affected)
Table 't_integer'. Scan count 2, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 2 ms.
like image 38
RichardTheKiwi Avatar answered Oct 14 '22 14:10

RichardTheKiwi