Is there a performance difference between CTE , Sub-Query, Temporary Table or Table Variable?

Tags:

In this excellent SO question, differences between CTE and sub-queries were discussed.

I would like to specifically ask:

In what circumstance is each of the following more efficient/faster?

CTE
Sub-Query
Temporary Table
Table Variable

Traditionally, I've used lots of temp tables in developing stored procedures - as they seem more readable than lots of intertwined sub-queries.

Non-recursive CTEs encapsulate sets of data very well, and are very readable, but are there specific circumstances where one can say they will always perform better? or is it a case of having to always fiddle around with the different options to find the most efficient solution?

EDIT

I've recently been told that in terms of efficiency, temporary tables are a good first choice as they have an associated histogram i.e. statistics.

697

asked Jun 23 '12 12:06

whytheq

2 Answers

SQL is a declarative language, not a procedural language. That is, you construct a SQL statement to describe the results that you want. You are not telling the SQL engine how to do the work.

As a general rule, it is a good idea to let the SQL engine and SQL optimizer find the best query plan. There are many person-years of effort that go into developing a SQL engine, so let the engineers do what they know how to do.

Of course, there are situations where the query plan is not optimal. Then you want to use query hints, restructure the query, update statistics, use temporary tables, add indexes, and so on to get better performance.

As for your question. The performance of CTEs and subqueries should, in theory, be the same since both provide the same information to the query optimizer. One difference is that a CTE used more than once could be easily identified and calculated once. The results could then be stored and read multiple times. Unfortunately, SQL Server does not seem to take advantage of this basic optimization method (you might call this common subquery elimination).

Temporary tables are a different matter, because you are providing more guidance on how the query should be run. One major difference is that the optimizer can use statistics from the temporary table to establish its query plan. This can result in performance gains. Also, if you have a complicated CTE (subquery) that is used more than once, then storing it in a temporary table will often give a performance boost. The query is executed only once.

The answer to your question is that you need to play around to get the performance you expect, particularly for complex queries that are run on a regular basis. In an ideal world, the query optimizer would find the perfect execution path. Although it often does, you may be able to find a way to get better performance.

answered Sep 19 '22 15:09

Gordon Linoff

There is no rule. I find CTEs more readable, and use them unless they exhibit some performance problem, in which case I investigate the actual problem rather than guess that the CTE is the problem and try to re-write it using a different approach. There is usually more to the issue than the way I chose to declaratively state my intentions with the query.

There are certainly cases when you can unravel CTEs or remove subqueries and replace them with a #temp table and reduce duration. This can be due to various things, such as stale stats, the inability to even get accurate stats (e.g. joining to a table-valued function), parallelism, or even the inability to generate an optimal plan because of the complexity of the query (in which case breaking it up may give the optimizer a fighting chance). But there are also cases where the I/O involved with creating a #temp table can outweigh the other performance aspects that may make a particular plan shape using a CTE less attractive.

Quite honestly, there are way too many variables to provide a "correct" answer to your question. There is no predictable way to know when a query may tip in favor of one approach or another - just know that, in theory, the same semantics for a CTE or a single subquery should execute the exact same. I think your question would be more valuable if you present some cases where this is not true - it may be that you have discovered a limitation in the optimizer (or discovered a known one), or it may be that your queries are not semantically equivalent or that one contains an element that thwarts optimization.

So I would suggest writing the query in a way that seems most natural to you, and only deviate when you discover an actual performance problem the optimizer is having. Personally I rank them CTE, then subquery, with #temp table being a last resort.

answered Sep 21 '22 15:09

Aaron Bertrand

Related questions
                            
                                Convert timestamp to date in MySQL query
                            
                                Case insensitive searching in Oracle
                            
                                How do I convert from BLOB to TEXT in MySQL?
                            
                                SQL - HAVING vs. WHERE
                            
                                How to update Identity Column in SQL Server?
                            
                                Does MS SQL Server's "between" include the range boundaries?
                            
                                Convert Datetime column from UTC to local time in select statement
                            
                                When to use Common Table Expression (CTE)
                            
                                PostgreSQL delete with inner join
                            
                                Count the occurrences of DISTINCT values
                            
                                How to write UPDATE SQL with Table alias in SQL Server 2008?
                            
                                How to round an average to 2 decimal places in PostgreSQL?
                            
                                Rounding off to two decimal places in SQL
                            
                                Check if a row exists, otherwise insert
                            
                                How do I create a foreign key in SQL Server?
                            
                                SQL JOIN and different types of JOINs
                            
                                Fastest way to count exact number of rows in a very large table?
                            
                                SQL WITH clause example [duplicate]
                            
                                How to speed up insertion performance in PostgreSQL
                            
                                How to list active connections on PostgreSQL?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Is there a performance difference between CTE , Sub-Query, Temporary Table or Table Variable?

Tags:

sql

sql-server

tsql

subquery

common-table-expression

whytheq

People also ask

2 Answers

Gordon Linoff

Aaron Bertrand

Recent Activity

Donate For Us