Bear with me, I'm new to intermediate.
My question is - When should I use a CTE? How do I decide if I should use a CTE?
When should I use this:
;with cteTesting as
(
select *
from testing.first_table
)
select *
from testing.second_table s
inner join cteTesting t
on s.key = t.key
Over this:
select *
from testing.second_table s
inner join
(
select *
from testing.first_table
) t
on s.key = t.key
And why? Is this just for code flow, code readability - or is there something more technical? Will one yield better execution plans under some circumstances?
edit: Just realized my example code example is very poor. I was trying to highlight that there are many cases where I can use a select in the from statement instead of a CTE - how do I decide which one I should be using?
A Common Table Expression (CTE) is the result set of a query which exists temporarily and for use only within the context of a larger query. Much like a derived table, the result of a CTE is not stored and exists only for the duration of the query. This article will focus on non-recurrsive CTEs.
A CTE can be used to: Create a recursive query. For more information, see Recursive Queries Using Common Table Expressions. Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
CTE can be reusable: One advantage of using CTE is CTE is reusable by design. Instead of having to declare the same subquery in every place you need to use it, you can use CTE to define a temporary table once, then refer to it whenever you need it.
The performance of CTEs and subqueries should, in theory, be the same since both provide the same information to the query optimizer. One difference is that a CTE used more than once could be easily identified and calculated once. The results could then be stored and read multiple times.
For simple examples, it doesn't make much difference. If you need to use the Recursive features to build up a hierarchy, then you haven't much choice - you need to use a CTE.
Another case where it probably doesn't make much performance difference, but does for readability, is when you need to join the same subquery multiple times. If you're using subqueries, you have to repeat the entire expression, whereas with CTEs, you just use the name twice:
;With NamedExpression as (
select t1.ID,t2.ID as ID2,SUM(t3.Value) as Val
from
Table1 t1
left join
Table2 t2 on t1.id = t2.t1id
inner join
Table3 t3 on t3.col = t1.id or t3.col2 = t2.id
group by
t1.ID,t2.ID
)
select
*
from
NamedExpression ne
inner join
NamedExpression ne2
on
ne.ID2 = ne2.ID
It should also be noted that if you do the above as subqueries, and the expressions are particularly complex, it can sometimes take time for the reader/maintainer to verify that the two subqueries are in fact identical, and there's not some subtle difference between the two
Also, if you have an indenting style that says that subqueries should appear further to the right than their enclosing query, then expressions that build on other expressions can cause all of the code to shift to the right - whereas with CTEs, you stop and move back to the left in building each subexpression (CTE):
;WITH CTE1 AS (
SELECT
...
), CTE2 as (
SELECT
...
FROM CTE1
), CTE3 as (
SELECT
...
FROM CTE2
)
select * from CTE3
vs:
select *
from
(
select ...
from
(
select ...
from
(
select ...
I personally find the CTE version to be more readable especially if the select gets bigger.
When you use the derived table more than once in the main SELECT it might be better to use the CTE because it tells the database that you want to run this only once. Although I wouldn't be surprised if the optimizer was smart enough to detect two identical sub-selects in the from clause and only runs them once:
with foo as (
select ..
from bar
)
select f1.*
from foo f1
join foo f2 on ...
vs.
select f1.*
from (select ... from bar ) f1
join (select ... from bar) f2 on ...
I think the most important part is to be consistent (across what you write and inside your team).
I noticed that JOINs (esp. when combined with lot of WHERE clauses) can have disastrous performance when large data sets are involved.
CTEs can resolve this by selecting only relevant records and joining over these subsets.
Consider a CTE as a sort of pre-select to prepare data for the final SELECT.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With