My question is around performance and the way a CTE gets evaluated in runtime.
I am planning to reuse code by defining a base projection and then defining multiple CTE's on top of this base projection with different filters.
Does that cause any performance issues.More specifically, does base projection will be evaluated every time.
For example:
WITH CTE_PERSON as (
SELECT * FROM PersonTable
),
CTE_PERSON_WITH_AGE as (
SELECT * FROM CTE_PERSON WHERE age > 24
),
CTE_PERSON_WITH_AGE_AND_GENDER as (
SELECT * FROM CTE_PERSON_WITH_AGE WHERE gender = 'm'
),
CTE_PERSON_WITH_NAME as (
SELECT * FROM CTE_PERSON WHERE name = 'abc'
)
A single scan.
Note:
- a single stage
- a single TableScan
- predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)
create table t (i int,j int,k int);
explain
with t1 as (select i,j,k from t where i=1)
,t2 as (select i,j,k from t1 where j=2)
,t3 as (select i,j,k from t2 where k=3)
select * from t3
;
Explain
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
TableScan
alias: t
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: 1 (type: int), 2 (type: int), 3 (type: int)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
ListSink
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With