Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HOW CTE (Common Table Expression) in HIVE gets evaluated

My question is around performance and the way a CTE gets evaluated in runtime.

I am planning to reuse code by defining a base projection and then defining multiple CTE's on top of this base projection with different filters.

Does that cause any performance issues.More specifically, does base projection will be evaluated every time.

For example:

WITH CTE_PERSON as (
   SELECT * FROM PersonTable
),


CTE_PERSON_WITH_AGE as (
   SELECT * FROM CTE_PERSON WHERE age > 24 
),

CTE_PERSON_WITH_AGE_AND_GENDER as (
  SELECT * FROM CTE_PERSON_WITH_AGE WHERE gender = 'm'
),

CTE_PERSON_WITH_NAME as (
  SELECT * FROM CTE_PERSON WHERE name = 'abc'
)
  • Does every time all the entries from PersonTable will get loaded into memory and then filters will be applied after (or)
  • Only Result set after filters will be loaded into memory.
like image 817
pavan kumar reddy Avatar asked Feb 27 '17 12:02

pavan kumar reddy


1 Answers

A single scan.

Note:
- a single stage
- a single TableScan
- predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)


create table t (i int,j int,k int);

explain 
with    t1 as (select i,j,k from t  where i=1)
       ,t2 as (select i,j,k from t1 where j=2)
       ,t3 as (select i,j,k from t2 where k=3) 

select * from t3
;

Explain
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: t
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          Filter Operator
            predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: 1 (type: int), 2 (type: int), 3 (type: int)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              ListSink
like image 125
David דודו Markovitz Avatar answered Dec 06 '22 09:12

David דודו Markovitz