Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why CTE (Common Table Expressions) in some cases slow down queries comparing to temporary tables in SQL Server

I have several cases where my complex CTE (Common Table Expressions) are ten times slower than the same queries using the temporary tables in SQL Server.

My question here is in regards to how SQL Server process the CTE queries, it looks like it tries to join all the separated queries instead of storing the results of each one and then trying to run the following ones. So that might be the reason why it is so faster when using temporary tables.

For example:

Query 1: using Common Table Expression:

;WITH Orders AS
(
    SELECT
        ma.MasterAccountId,
        IIF(r.FinalisedDate IS NULL, 1, 0)) [Status]
    FROM 
        MasterAccount ma
    INNER JOIN 
        task.tblAccounts a ON a.AccountNumber = ma.TaskAccountId 
                           AND a.IsActive = 1
    LEFT OUTER JOIN 
        task.tblRequisitions r ON r.AccountNumber = a.AccountNumber 
    WHERE 
        ma.IsActive = 1
        AND CAST(r.BatchDateTime AS DATE) BETWEEN @fromDate AND @toDate
        AND r.BatchNumber > 0
),
StockAvailability AS
(
    SELECT sa.AccountNumber,
           sa.RequisitionNumber,
           sa.RequisitionDate,
           sa.Lines,
           sa.HasStock,
           sa.NoStock,
           CASE WHEN sa.Lines = 0 THEN 'Empty'
                WHEN sa.HasStock = 0 THEN 'None'
                WHEN (sa.Lines > 0 AND sa.Lines > sa.HasStock) THEN 'Partial'
                WHEN (sa.Lines > 0 AND sa.Lines <= sa.HasStock) THEN 'Full'
            END AS [Status]
    FROM
    (
        SELECT
                r.AccountNumber,
                r.RequisitionNumber,
                r.RequisitionDate,
                COUNT(rl.ProductNumber) Lines,
                SUM(IIF(ISNULL(psoh.AvailableStock, 0) >= ISNULL(rl.Quantity, 0), 1, 0)) AS HasStock,
                SUM(IIF(ISNULL(psoh.AvailableStock, 0) < ISNULL(rl.Quantity, 0), 1, 0)) AS NoStock

        FROM task.tblrequisitions r 
        INNER JOIN task.tblRequisitionLines rl ON rl.RequisitionNumber = r.RequisitionNumber
        LEFT JOIN ProductStockOnHandSummary psoh ON psoh.ProductNumber = rl.ProductNumber

        WHERE dbo.fn_RemoveUnitPrefix(r.BatchNumber) = 0
          AND r.UnitId = 1
          AND r.FinalisedDate IS NULL
          AND r.RequisitionStatus = 1 
          AND r.TransactionTypeNumber = 301 
        GROUP BY r.AccountNumber, r.RequisitionNumber, r.RequisitionDate
    ) AS sa
),
Available AS
(
    SELECT  ma.MasterAccountId,
            SUM(IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                            CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END)) AS AvailableStock,
            SUM(IIF(sa.[Status] IN ('Full', 'Partial', 'None'), 1, 0))  AS OrdersAnyStock, 

            SUM(IIF(sa.RequisitionDate < dbo.TicksToTime(ma.DailyOrderCutOffTime, @toDate),
                    IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                                CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END), 0)) AS AvailableBeforeCutOff                             
    FROM MasterAccount ma
    INNER JOIN StockAvailability sa ON sa.AccountNumber = ma.TaskAccountId
    GROUP BY ma.MasterAccountId, ma.IsPartialStock
),
Totals AS
(
    SELECT 
        o.MasterAccountId,
        COUNT(o.MasterAccountId) AS BatchedOrders
    FROM Orders o
    GROUP BY o.MasterAccountId
)
SELECT a.MasterAccountId,
       ISNULL(t.BatchedOrders, 0) BatchedOrders,
       ISNULL(t.PendingOrders, 0) PendingOrders,
       ISNULL(av.AvailableStock, 0) AvailableOrders,
       ISNULL(av.AvailableBeforeCutOff, 0) AvailableCutOff,
       ISNULL(av.OrdersAnyStock, 0) AllOrders
FROM MasterAccount a
LEFT OUTER JOIN Available av ON av.MasterAccountId = a.MasterAccountId
LEFT OUTER JOIN Totals t ON t.MasterAccountId = a.MasterAccountId
WHERE a.IsActive = 1

Query 2: using temporary tables:

DROP TABLE IF EXISTS #Orders

CREATE TABLE #Orders (MasterAccountId int, [Status] int);

INSERT INTO #Orders
SELECT
    ma.MasterAccountId,
    dbo.fn_GetBatchPickingStatus(ma.BatchPickingOnHold,
                                    iif(r.GroupNumber > 0, 1, 0),
                                    iif(r.FinalisedDate is null, 1, 0)) [Status]
FROM MasterAccount ma (nolock)
INNER JOIN wh3.dbo.tblAccounts a (nolock) on a.AccountNumber = dbo.fn_RemoveUnitPrefix(ma.TaskAccountId) and a.IsActive = 1
LEFT OUTER JOIN wh3.dbo.tblRequisitions r (nolock) on r.AccountNumber = a.AccountNumber 
WHERE cast(r.BatchDateTime as date) between @fromDate and @toDate
    AND r.BatchNumber > 0
    AND ma.IsActive = 1

DROP TABLE IF EXISTS #StockAvailability
Create Table #StockAvailability (AccountNumber int, RequisitionNumber int, RequisitionDate datetime, Lines int, HasStock int, NoStock int);
Insert Into #StockAvailability
SELECT
        r.AccountNumber,
        r.RequisitionNumber,
        r.RequisitionDate,
        COUNT(rl.ProductNumber) Lines,
        SUM(IIF(ISNULL(psoh.AvailableStock, 0) >= ISNULL(rl.Quantity, 0), 1, 0)) AS HasStock,
        SUM(IIF(ISNULL(psoh.AvailableStock, 0) < ISNULL(rl.Quantity, 0), 1, 0)) AS NoStock

FROM WH3.dbo.tblrequisitions r (nolock)
INNER JOIN WH3.dbo.tblRequisitionLines rl (nolock) ON rl.RequisitionNumber = r.RequisitionNumber
LEFT JOIN ProductStockOnHandSummary psoh (nolock) ON psoh.ProductNumber = rl.ProductNumber -- Joined with View          
WHERE r.BatchNumber = 0
    AND r.FinalisedDate is null
    AND r.RequisitionStatus = 1 
    AND r.TransactionTypeNumber = 301 
GROUP BY r.AccountNumber, r.RequisitionNumber, r.RequisitionDate

DROP TABLE IF EXISTS #StockAvailability2
Create Table #StockAvailability2 (AccountNumber int, RequisitionNumber int, RequisitionDate datetime, Lines int, HasStock int, NoStock int, [Status] nvarchar(7));
Insert Into #StockAvailability2
SELECT sa.AccountNumber,
        sa.RequisitionNumber,
        sa.RequisitionDate,
        sa.Lines,
        sa.HasStock,
        sa.NoStock,
        CASE WHEN sa.Lines = 0 THEN 'Empty'
            WHEN sa.HasStock = 0 THEN 'None'
            WHEN (sa.Lines > 0 AND sa.Lines > sa.HasStock) THEN 'Partial'
            WHEN (sa.Lines > 0 AND sa.Lines <= sa.HasStock) THEN 'Full'
        END AS [Status]
FROM #StockAvailability sa

DROP TABLE IF EXISTS #Available
Create Table #Available (MasterAccountId int, AvailableStock int, OrdersAnyStock int, AvailableBeforeCutOff int);
INSERT INTO #Available
SELECT  ma.MasterAccountId,
        SUM(IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                        CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END)) AS AvailableStock,
        SUM(IIF(sa.[Status] IN ('Full', 'Partial', 'None'), 1, 0))  AS OrdersAnyStock, 

        SUM(IIF(sa.RequisitionDate < dbo.TicksToTime(ma.DailyOrderCutOffTime, @toDate),
                IIF(ma.IsPartialStock = 1,  CASE WHEN sa.[Status] IN ('Full', 'Partial') THEN 1 ELSE 0 END, 
                                            CASE WHEN sa.[Status] = 'Full' THEN 1 ELSE 0 END), 0)) AS AvailableBeforeCutOff                             
FROM MasterAccount ma (NOLOCK)
INNER JOIN #StockAvailability2 sa ON sa.AccountNumber = dbo.fn_RemoveUnitPrefix(ma.TaskAccountId)
GROUP BY ma.MasterAccountId, ma.IsPartialStock


;WITH Totals AS
(
    SELECT 
        o.MasterAccountId,
        COUNT(o.MasterAccountId) AS BatchedOrders,
        SUM(IIF(o.[Status] IN (0,1,2), 1, 0)) PendingOrders
    FROM #Orders o (NOLOCK)
    GROUP BY o.MasterAccountId
)
SELECT a.MasterAccountId,
       ISNULL(t.BatchedOrders, 0) BatchedOrders,
       ISNULL(t.PendingOrders, 0) PendingOrders,
       ISNULL(av.AvailableStock, 0) AvailableOrders,
       ISNULL(av.AvailableBeforeCutOff, 0) AvailableCutOff,
       ISNULL(av.OrdersAnyStock, 0) AllOrders
FROM MasterAccount a (NOLOCK)
LEFT OUTER JOIN #Available av (NOLOCK) ON av.MasterAccountId = a.MasterAccountId
LEFT OUTER JOIN Totals t (NOLOCK) ON t.MasterAccountId = a.MasterAccountId
WHERE a.IsActive = 1
like image 560
Roger Oliveira Avatar asked Aug 13 '18 02:08

Roger Oliveira


People also ask

What is the main reason for using a CTE Common table Expression?

CTEs, like database views and derived tables, enable users to more easily write and maintain complex queries via increased readability and simplification. This reduction in complexity is achieved by deconstructing ordinarily complex queries into simple blocks to be used, and reused if necessary, in rewriting the query.

What is difference between CTE and temp table?

Probably the biggest difference between a CTE and a temp table, is that the CTE has an execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. Essentially you can't reuse the CTE, like you can with temp tables.

Are CTE faster than temp tables?

Looking at SQL Profiler results from these queries (each were run 10 times and averages are below) we can see that the CTE just slightly outperforms both the temporary table and table variable queries when it comes to overall duration.

What is the advantage of CTE over temp table?

It is a temporary result set and typically it may be a result of complex sub-query. Unlike the temporary table, its life is limited to the current query. It is defined by using WITH statement. CTE improves readability and ease in maintenance of complex queries and sub-queries.


2 Answers

The answer is simple.

SQL Server doesn't materialise CTEs. It inlines them, as you can see from the execution plans.

Other DBMS may implement it differently, a well-known example is Postgres, which does materialise CTEs (it essentially creates temporary tables for CTEs behind the hood).

Whether explicit materialisation of intermediary results in explicit temporary tables is faster, depends on the query.

In complex queries the overhead of writing and reading intermediary data into temporary tables can be offset by more efficient simpler execution plans that optimiser is able to generate.

On the other hand, in Postgres CTE is an "optimisation fence" and engine can't push predicates across CTE boundary.

Sometimes one way is better, sometimes another. Once the query complexity grows beyond certain threshold an optimiser can't analyse all possible ways to process the data and it has to settle on something. For example, the order in which to join the tables. The number of permutations grows exponentially with the number of tables to choose from. Optimiser has limited time to generate a plan, so it may make a poor choice when all CTEs are inlined. When you manually break complex query into smaller simpler ones you need to understand what you are doing, but optimiser has a better chance to generate a good plan for each simple query.

like image 139
Vladimir Baranov Avatar answered Oct 24 '22 19:10

Vladimir Baranov


There are different use cases for the two, and different advantages/disadvantages.

Common Table Expressions

Common Table Expressions should be viewed as expressions, not tables. As expressions, the CTE does not need to be instantiated, so the query optimizer can fold it into the rest of the query, and optimize the combination of the CTE and the rest of the query.

Temporary Tables

With temporary tables, the results of the query are stored in a real live table, in the temp database. The query results can then be reused in multiple queries, unlike CTEs, where the CTE, if used in multiple separate queries, would have to be a part of the work plan in each of those separate queries.

Also, a temporary table can have an index, keys, etc. Adding these to a temp table can be a great assistance in optimizing some queries, and is unavailable in the CTE, though the CTE can utilize the indexes and keys in the tables underlying the CTE.

If the underlying tables to a CTE don't support the type of optimizations you need, a temp table may be better.

like image 39
rsjaffe Avatar answered Oct 24 '22 18:10

rsjaffe