T-SQL bad performance with CTE

Tags:

I have a performance question about Common table expressions in SQL Server. In our developer team we use a lot of chaining CTEs when building our queries. I am currently working on a query which had terrible performance. But I found out that if I in the middle of the chain inserted all the records up to that CTE in a temporary table instead and then continued but selecting from that temp table I improved performance significantly. Now I would like to get some help to understand if this type of change only applies to this specific query and why the two cases you will see below differ so much in performance. Or could we possibly overuse CTEs in our team and can we gain performance generally by learning from this case?

Please try to explain to me exactly what is happening here...

The code is complete and you will be able to run it on SQL Server 2008 and probably 2005 too. One part is commented out and my idea is that you can switch the two cases by comment out one or the other. You can see where to put your block comments, I have marked these places with --block comment here and --end block comment here

It is the slow performing case that is uncommented default. Here you are:

--Declare tables to use in example.
CREATE TABLE #Preparation 
(
    Date DATETIME NOT NULL
    ,Hour INT NOT NULL
    ,Sales NUMERIC(9,2)
    ,Items INT
);

CREATE TABLE #Calendar
(
    Date DATETIME NOT NULL
)

CREATE TABLE #OpenHours
(
    Day INT NOT NULL,
    OpenFrom TIME NOT NULL,
    OpenTo TIME NOT NULL
);

--Fill tables with sample data.
INSERT INTO #OpenHours (Day, OpenFrom, OpenTo)
VALUES
    (1, '10:00', '20:00'),
    (2, '10:00', '20:00'),
    (3, '10:00', '20:00'),
    (4, '10:00', '20:00'),
    (5, '10:00', '20:00'),
    (6, '10:00', '20:00'),
    (7, '10:00', '20:00')

DECLARE @CounterDay INT = 0, @CounterHour INT = 0, @Sales NUMERIC(9, 2), @Items INT;

WHILE @CounterDay < 365
BEGIN
    SET @CounterHour = 0;
    WHILE @CounterHour < 5
    BEGIN
        SET @Items = CAST(RAND() * 100 AS INT);
        SET @Sales = CAST(RAND() * 1000 AS NUMERIC(9, 2));
        IF @Items % 2 = 0
        BEGIN
            SET @Items = NULL;
            SET @Sales = NULL;
        END

        INSERT INTO #Preparation (Date, Hour, Items, Sales)
        VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'), @CounterHour + 13, @Items, @Sales);

        SET @CounterHour += 1;
    END
    INSERT INTO #Calendar (Date) VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'));
    SET @CounterDay += 1;
END

--Here the query starts.
;WITH P AS (
    SELECT DATEADD(HOUR, Hour, Date) AS Hour
        ,Sales
        ,Items
    FROM #Preparation
),
O AS (
        SELECT DISTINCT DATEADD(HOUR, SV.number, C.Date) AS Hour
        FROM #OpenHours AS O
            JOIN #Calendar AS C ON O.Day = DATEPART(WEEKDAY, C.Date)
            JOIN master.dbo.spt_values AS SV ON SV.number BETWEEN DATEPART(HOUR, O.OpenFrom) AND DATEPART(HOUR, O.OpenTo)
),
S AS (
    SELECT O.Hour, P.Sales, P.Items
    FROM O
        LEFT JOIN P ON P.Hour = O.Hour
)

--block comment here case 1 (slow performing)
--With this technique it takes about 34 seconds.
,N AS (
        SELECT  
            A.Hour
            ,A.Sales AS SalesOrg
            ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
                THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
            ,A.Items AS ItemsOrg
            ,COALESCE(B.Items, C.Items, 1) AS Items
        FROM S AS A
        OUTER APPLY (SELECT TOP 1 *
                     FROM S
                     WHERE Hour <= A.Hour
                        AND Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0                      
                     ORDER BY Hour DESC) B
        OUTER APPLY (SELECT TOP 1 *
                     FROM S
                     WHERE Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0
                     ORDER BY Hour) C
    )
--end block comment here case 1 (slow performing)

/*--block comment here case 2 (fast performing)
--With this technique it takes about 2 seconds.
SELECT * INTO #tmpS FROM S;

WITH
N AS (
        SELECT  
            A.Hour
            ,A.Sales AS SalesOrg
            ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
                THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
            ,A.Items AS ItemsOrg
            ,COALESCE(B.Items, C.Items, 1) AS Items
        FROM #tmpS AS A
        OUTER APPLY (SELECT TOP 1 *
                     FROM #tmpS
                     WHERE Hour <= A.Hour
                        AND Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0                      
                     ORDER BY Hour DESC) B
        OUTER APPLY (SELECT TOP 1 *
                     FROM #tmpS
                     WHERE Sales IS NOT NULL
                        AND DATEDIFF(DAY, Hour, A.Hour) = 0
                     ORDER BY Hour) C
    )
--end block comment here case 2 (fast performing)*/
SELECT * FROM N ORDER BY Hour


IF OBJECT_ID('tempdb..#tmpS') IS NOT NULL DROP TABLE #tmpS;

DROP TABLE #Preparation;
DROP TABLE #Calendar;
DROP TABLE #OpenHours;

If you would like to try and understand what I am doing in the last step I have a SO question about it here.

For me case 1 takes about 34 seconds and case 2 takes about 2 seconds. The difference is that I store the result from S in a temp table in case 2, in case 1 I use S in my next CTE directly.

473

asked May 25 '12 09:05

John

1 Answers

A CTE is essentially just a disposable view. It will pretty much never make a query any faster than just putting the CTE code into a FROM clause as a table expression.

In your example, the real issue is the date functions I believe.

Your first (slow) case requires the date functions to be run for every row.

For your second (faster) case they are run once and stored into a table.

This is not normally so noticeable unless you do some sort of logic on the function-derived field. In your case you are doing an ORDER BY on Hour, which is very costly. In your second example it's a simple sort on a field, but in the first you are running that function for each row, THEN sorting.

For a lot more in-depth reading on CTEs, see this question on DBA.SE.

answered Sep 17 '22 12:09

JNK

Related questions
                            
                                TSQL logging inside transaction
                            
                                Receiving error "Select statements included within a function cannot return data to a client"
                            
                                How can I change the connection for current tab in Microsoft SQL Server Management Studio
                            
                                REST API Architecture: How to Represent Joined Tables
                            
                                Order of (nolock) and table aliases on SQL Server
                            
                                SQL Server - Dynamic PIVOT Table - SQL Injection
                            
                                Paste text with newlines in it in Sql Server management studio 2008 'edit top 200 rows' window in a varchar(200) column
                            
                                SQL Server 2012 error: object reference not set to an instance of an object
                            
                                creating triggers for After Insert, After Update and After Delete in SQL
                            
                                Query to display spent credits from transactional table
                            
                                Transactionscope throwing exception this platform does not support distributed transactions while opening connection object
                            
                                Which lock hints should I use (T-SQL)?
                            
                                ZombieCheck Exception - This SqlTransaction has completed; it is no longer usable -- during simple commit
                            
                                Why is SQL Server 2008 blocking SELECT's on long transaction INSERT's?
                            
                                Large SQL transaction: runs out of memory on PostgreSQL, yet works on SQL Server
                            
                                Introducing FOREIGN KEY constraint may cause cycles or multiple cascade paths. Specify on delete no action
                            
                                "public" role access in SQL Server
                            
                                Prevent mutually recursive execution of triggers?
                            
                                Entity Framework: How to properly handle exceptions that occur due to SQL constraints
                            
                                How can I make SQL Server return FALSE for comparing varchars with and without trailing spaces?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

T-SQL bad performance with CTE

Tags:

performance

sql-server

tsql

common-table-expression

John

People also ask

1 Answers

JNK

Recent Activity

Donate For Us