
HIVE - INSERT OVERWRITE using WITH CLAUSE

Tags: hadoop, hive

I have a generated query that starts with a WITH clause. It works fine when I run it in the console, but it fails when I try to run it with INSERT OVERWRITE to load the output into a separate Hive table:

INSERT OVERWRITE TABLE $proc_db.$master_table PARTITION(created_dt, country) $master_query

It throws the following error:

cannot recognize input near 'WITH' 't' 'as' in statement

The query is as follows:

master_query="
WITH t
AS (
SELECT subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt
FROM crm_arrow.birthday
WHERE created_dt = '2016-07-07'
    AND (COUNTRY = 'SG')
GROUP BY subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt

UNION ALL

SELECT subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt
FROM crm_arrow.wishlist
WHERE created_dt = '2016-07-07'
    AND (COUNTRY = 'SG')
GROUP BY subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt

UNION ALL
.....
)
SELECT q.subscription_id
,q.country
,q.email_type
FROM (
SELECT t1.subscription_id
    ,t1.country
    ,DENSE_RANK() OVER (
        PARTITION BY t1.subscription_id
        ,t1.country ORDER BY t1.email_priority
        ) global_rank
    ,CASE 
        WHEN t1.email_type = t2.email_type
            THEN t1.email_type
        END email_type
FROM t t1
LEFT JOIN t t2 ON t1.country = t2.country
    AND t1.subscription_id = t2.subscription_id
) q
WHERE q.email_type IS NOT NULL
AND (
    q.global_rank <= 2
    AND country = 'SG'
    )
"

How can I make an efficient self join with a huge inner query? I have also tried wrapping the whole master_query in an outer SELECT statement, but it still does not work.

Logan asked Jul 07 '16 12:07


2 Answers

The problem is just where you put your INSERT statement. Here is an example of how to combine INSERT with a WITH clause:

CREATE TABLE ramesh_test
(key          BIGINT,
 text_value   STRING,
 roman_value  STRING)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t' 
LINES TERMINATED BY '\n' 
STORED AS TEXTFILE;

WITH v_text
AS
(SELECT 1 AS key, 'One' AS value),
v_roman
AS
(SELECT 1 AS key, 'I' AS value)
INSERT OVERWRITE TABLE ramesh_test
SELECT v_text.key, v_text.value, v_roman.value
  FROM v_text JOIN v_roman
                ON (v_text.key = v_roman.key);

Position the INSERT after the WITH clause, directly above the main SELECT.
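Since the query in the question is assembled in a shell variable, one way to apply this is to generate the WITH clause and the main SELECT as two separate strings and place the INSERT between them. The following is only a minimal sketch: the values of proc_db and master_table, the variable names cte_query, select_query and full_query, and the tiny stand-in CTE are all assumptions for illustration, not the real generated query.

```shell
#!/bin/sh
# Hypothetical values: the question only shows $proc_db and $master_table as variables.
proc_db="proc_db_example"
master_table="master_example"

# Part 1: the WITH clause only (a tiny stand-in for the real generated CTE).
cte_query="WITH t AS (
  SELECT 1 AS subscription_id, 'A' AS email_type, '2016-07-07' AS created_dt, 'SG' AS country
)"

# Part 2: the main SELECT that reads from the CTE.
# For a dynamic-partition insert the partition columns are listed last,
# in the same order as in the PARTITION clause.
select_query="SELECT t.subscription_id, t.email_type, t.created_dt, t.country
FROM t"

# Assemble in the order Hive accepts: WITH ... INSERT OVERWRITE ... SELECT ...
full_query="${cte_query}
INSERT OVERWRITE TABLE ${proc_db}.${master_table} PARTITION(created_dt, country)
${select_query}"

printf '%s\n' "$full_query"
```

The assembled string could then be passed to the Hive CLI, for example with hive -e "$full_query".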

Hope this helps!

Ramesh answered Oct 07 '22 23:10


You need to change your query so that INSERT OVERWRITE comes before the SELECT q.subscription_id clause.

Please see this sample. Put one or more WITH clauses on top, then write the INSERT OVERWRITE (the sample uses INSERT INTO, which goes in the same position) immediately followed by the SELECT query:

WITH TABLE1 
AS
(
    -- Alias each CAST so the outer SELECT can refer to the columns by name
    SELECT 
    cod_index,
    CAST(test_1 AS VARCHAR(200)) AS test_1, 
    CAST(test_2 AS VARCHAR(200)) AS test_2, 
    CAST(test_3 AS VARCHAR(200)) AS test_3
    FROM db_h_gss.tb_h_test_orig
)
INSERT INTO TABLE db_h_gss.tb_h_test_insert PARTITION (cod_index = 1)
SELECT
    test_1,
    test_2,
    test_3
FROM TABLE1 WHERE cod_index = 1;
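
If the generated query has to stay in a single shell variable, as in the question, another option might be to have the generator emit a placeholder between the WITH clause and the main SELECT and swap it for the INSERT clause afterwards. This is a sketch under assumptions: the marker token @@INSERT_HERE@@, the table example_db.example_table and the small stand-in query are all hypothetical, and it presumes the generator can be changed to emit the marker.

```shell
#!/bin/sh
# Hypothetical marker that the query generator is assumed to emit
# between the WITH clause and the main SELECT.
marker="@@INSERT_HERE@@"

# Stand-in for the generated query (not the real one from the question).
master_query="WITH t AS (
  SELECT 1 AS id, 'SG' AS country
)
${marker}
SELECT t.id, t.country
FROM t"

# Hypothetical target table, used only for illustration.
insert_clause="INSERT OVERWRITE TABLE example_db.example_table PARTITION(country)"

# Split on the marker with POSIX parameter expansion.
before=${master_query%%"$marker"*}   # everything before the marker
after=${master_query#*"$marker"}     # everything after the marker

# Put the INSERT clause where the marker was.
full_query="${before}${insert_clause}${after}"

printf '%s\n' "$full_query"
```

Splitting on an explicit marker avoids guessing where the last top-level SELECT starts, which is hard to do reliably with text matching when the query contains nested SELECTs.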

AlphaBetaGamma answered Oct 08 '22 00:10