I have a query that looks like
SELECT
P.Column1,
P.Column2,
P.Column3,
...
(
SELECT
A.ColumnX,
A.ColumnY,
...
FROM
dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
WHERE
A.Key = P.Key
FOR XML AUTO, TYPE
),
(
SELECT
B.ColumnX,
B.ColumnY,
...
FROM
dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B
WHERE
B.Key = P.Key
FOR XML AUTO, TYPE
)
FROM
(
<joined tables here>
) AS P
FOR XML AUTO,ROOT('ROOT')
P has ~ 5000 rows A and B ~ 4000 rows each
This query has a runtime performance of ~10+ minutes.
Changing it to this however:
SELECT
P.Column1,
P.Column2,
P.Column3,
...
INTO #P
SELECT
A.ColumnX,
A.ColumnY,
...
INTO #A
FROM
dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
SELECT
B.ColumnX,
B.ColumnY,
...
INTO #B
FROM
dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B
SELECT
P.Column1,
P.Column2,
P.Column3,
...
(
SELECT
A.ColumnX,
A.ColumnY,
...
FROM
#A AS A
WHERE
A.Key = P.Key
FOR XML AUTO, TYPE
),
(
SELECT
B.ColumnX,
B.ColumnY,
...
FROM
#B AS B
WHERE
B.Key = P.Key
FOR XML AUTO, TYPE
)
FROM #P AS P
FOR XML AUTO,ROOT('ROOT')
Has a performance of ~4 seconds.
This makes not a lot of sense, as it would seem the cost to insert into a temp table and then do the join should be higher by default. My inclination is that SQL is doing the wrong type of "join" with the subquery, but maybe I've missed it, there's no way to specify the join type to use with correlated subqueries.
Is there a way to achieve this without using #temp tables/@table variables via indexes and/or hints?
EDIT: Note that dbo.TableReturningFunc1 and dbo.TableReturningFunc2 are inline TVF's, not multi-statement, or they are "parameterized" view statements.
Example 5: INSERT INTO SELECT statement with Join clause to get data from multiple tables. We can use a JOIN clause to get data from multiple tables. These tables are joined with conditions specified with the ON clause. Suppose we want to get data from multiple tables and insert into a table.
The INSERT INTO statement is used to insert new records in a table.
INSERT INTO SELECT vs SELECT INTO: Both the statements could be used to copy data from one table to another. But INSERT INTO SELECT could be used only if the target table exists whereas SELECT INTO statement could be used even if the target table doesn't exist as it creates the target table if it doesn't exist.
Your procedures are being reevaluated for each row in P
.
What you do with the temp tables is in fact caching the resultset generated by the stored procedures, thus removing the need to reevaluate.
Inserting into a temp table is fast because it does not generate redo
/ rollback
.
Joins are also fast, since having a stable resultset allows possibility to create a temporary index with an Eager Spool
or a Worktable
You can reuse the procedures without temp tables, using CTE
's, but for this to be efficient, SQL Server
needs to materialize the results of CTE
.
You may try to force it do this with using an ORDER BY
inside a subquery:
WITH f1 AS
(
SELECT TOP 1000000000
A.ColumnX,
A.ColumnY
FROM dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
ORDER BY
A.key
),
f2 AS
(
SELECT TOP 1000000000
B.ColumnX,
B.ColumnY,
FROM dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B
ORDER BY
B.Key
)
SELECT …
, which may result in Eager Spool
generated by the optimizer.
However, this is far from being guaranteed.
The guaranteed way is to add an OPTION (USE PLAN)
to your query and wrap the correspondind CTE
into the Spool
clause.
See this entry in my blog on how to do that:
This is hard to maintain, since you will need to rewrite your plan each time you rewrite the query, but this works well and is quite efficient.
Using the temp tables will be much easier, though.
This answer needs to be read together with Quassnoi's article
http://explainextended.com/2009/05/28/generating-xml-in-subqueries/
With judicious application of CROSS APPLY, you can force the caching or shortcut evaluation of inline TVFs. This query returns instantaneously.
SELECT *
FROM (
SELECT (
SELECT f.num
FOR XML PATH('fo'), ELEMENTS ABSENT
) AS x
FROM [20090528_tvf].t_integer i
cross apply (
select num
from [20090528_tvf].fn_num(9990) f
where f.num = i.num
) f
) q
--WHERE x IS NOT NULL -- covered by using CROSS apply
FOR XML AUTO
You haven't provided real structures so it's hard to construct something meaningful, but the technique should apply as well.
If you change the multi-statement TVF in Quassnoi's article to an inline TVF, the plan becomes even faster (at least one order of magnitude) and the plan magically reduces to something I cannot understand (it's too simple!).
CREATE FUNCTION [20090528_tvf].fn_num(@maxval INT)
RETURNS TABLE
AS RETURN
SELECT num + @maxval num
FROM t_integer
Statistics
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(10 row(s) affected)
Table 't_integer'. Scan count 2, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With