SQL Server CTE referred in self joins slow

Question

I have written a table-valued UDF that starts with a CTE returning a subset of the rows from a large table. The CTE contains several joins: a couple of inner joins and one left join to other tables, none of which contain many rows. The CTE has a WHERE clause that restricts the rows to a date range, so that only the rows needed are returned.

I then reference this CTE in 4 self left joins in order to build subtotals using different criteria.

The query is quite complex, but here is a simplified pseudo-version of it:

WITH DataCTE AS
(
    SELECT [columns]
    FROM table
    INNER JOIN table2
        ON [...]
    INNER JOIN table3
        ON [...]
    LEFT JOIN table4
        ON [...]
    WHERE [date range filter]
)
SELECT [aggregate columns of each subset]
FROM DataCTE Main
LEFT JOIN DataCTE BananasSubset
    ON [...]
    AND BananasSubset.Product = 'Bananas'
    AND BananasSubset.Quality = 100
LEFT JOIN DataCTE DamagedBananasSubset
    ON [...]
    AND DamagedBananasSubset.Product = 'Bananas'
    AND DamagedBananasSubset.Quality < 20
LEFT JOIN DataCTE MangosSubset
    ON [...]
GROUP BY [...]

I suspect that SQL Server re-evaluates the CTE for each self join. The execution plan seems to confirm this, although I confess I am no expert at reading plans.

I would have assumed SQL Server to be smart enough to perform the data retrieval from the CTE only once, rather than several times.

I tried the same approach, but instead of using a CTE to get the subset of the data, I ran the same SELECT query and wrote its output to a temp table.

The version that references the CTE takes 40 seconds. The version that references the temp table takes between 1 and 2 seconds.

Why isn't SQL Server smart enough to keep the CTE results in memory?

I like CTEs, especially in this case as my UDF is a table-valued one, so it allowed me to keep everything in a single statement.

To use a temp table, I would need to write a multi-statement table valued UDF, which I find a slightly less elegant solution.

Have any of you had this kind of performance issue with CTEs, and if so, how did you resolve it?

Thanks,

Kharlos

Abe Miessler · Accepted Answer

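I believe that the CTE's query is re-run each time the CTE is referenced; SQL Server does not cache its results. A temp table, by contrast, stores the results until it is dropped, so the expensive query runs only once. This would seem to explain the performance gains you saw when you switched to a temp table.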
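As a rough illustration of the temp-table approach, here is a minimal T-SQL sketch. The table #Sales, its columns, the sample rows, and the join on SaleId are all hypothetical stand-ins for the real tables and the elided join conditions in the question; only the overall shape (filter once, store, then self-join the stored rows) reflects what was described.

```sql
-- Hypothetical source table standing in for the large table in the question.
CREATE TABLE #Sales (
    SaleId   INT IDENTITY(1,1) PRIMARY KEY,
    SaleDate DATE          NOT NULL,
    Store    VARCHAR(20)   NOT NULL,
    Product  VARCHAR(20)   NOT NULL,
    Quality  INT           NOT NULL,
    Amount   DECIMAL(10,2) NOT NULL
);

INSERT INTO #Sales (SaleDate, Store, Product, Quality, Amount) VALUES
    ('2024-01-05', 'North', 'Bananas', 100, 10.00),
    ('2024-01-06', 'North', 'Bananas',  10,  4.00),
    ('2024-01-07', 'North', 'Mangos',   90,  7.50),
    ('2024-01-08', 'South', 'Bananas', 100, 12.00),
    ('2023-12-31', 'South', 'Bananas', 100, 99.00); -- outside the date range

-- Step 1: run the filtered query once and store its rows.
SELECT SaleId, Store, Product, Quality, Amount
INTO   #Data
FROM   #Sales
WHERE  SaleDate >= '2024-01-01'
  AND  SaleDate <  '2024-02-01';

-- Step 2: self-join the stored rows. Every reference to #Data reads the
-- same stored rows, so the filtered query above is not executed again.
SELECT    Main.Store,
          SUM(Main.Amount)    AS TotalAmount,
          SUM(Bananas.Amount) AS BananasAmount,
          SUM(Damaged.Amount) AS DamagedBananasAmount
FROM      #Data AS Main
LEFT JOIN #Data AS Bananas
       ON Bananas.SaleId  = Main.SaleId
      AND Bananas.Product = 'Bananas'
      AND Bananas.Quality = 100
LEFT JOIN #Data AS Damaged
       ON Damaged.SaleId  = Main.SaleId
      AND Damaged.Product = 'Bananas'
      AND Damaged.Quality < 20
GROUP BY  Main.Store;

-- Clean up the temp tables.
DROP TABLE #Data;
DROP TABLE #Sales;
```

Note that SQL Server does not allow temp tables inside a user-defined function, so a multi-statement table-valued UDF would have to apply the same idea with a table variable; a stored procedure could use the temp table directly.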

Another benefit is that you can create indexes on a temp table, which you can't do on a CTE. I'm not sure whether that would help in your situation, but it's good to know.
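For the indexing point, something like the following might apply. The table #Data, its columns, and the index names are hypothetical, and which columns are worth indexing depends on the real join and filter conditions, which the question does not show.

```sql
-- Hypothetical temp table holding the pre-filtered rows.
CREATE TABLE #Data (
    SaleId  INT           NOT NULL,
    Store   VARCHAR(20)   NOT NULL,
    Product VARCHAR(20)   NOT NULL,
    Quality INT           NOT NULL,
    Amount  DECIMAL(10,2) NOT NULL
);

-- Index the column assumed to be used in the self-join conditions.
CREATE CLUSTERED INDEX IX_Data_SaleId
    ON #Data (SaleId);

-- Index the columns assumed to be used in the subset filters,
-- carrying Amount along so the subtotals can be read from the index.
CREATE NONCLUSTERED INDEX IX_Data_Product_Quality
    ON #Data (Product, Quality)
    INCLUDE (Amount);

DROP TABLE #Data;
```

Whether these indexes pay off is worth checking against the actual execution plan, since building them also has a cost on each run.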

Tags:

performance

sql-server

temp-tables

common-table-expression

Kharlos Dominguez

1 Answers

Abe Miessler
