I have a generated query that starts with a WITH clause. It works fine when I run it in the console, but it fails when I try to run it with INSERT OVERWRITE to load the output into a separate Hive table:
INSERT OVERWRITE TABLE $proc_db.$master_table PARTITION(created_dt, country) $master_query
It throws the following error:
cannot recognize input near 'WITH' 't' 'as' in statement
The query is as follows:
master_query="
WITH t
AS (
SELECT subscription_id
,country
,email_type
,email_priority
,created_dt
FROM crm_arrow.birthday
WHERE created_dt = '2016-07-07'
AND (COUNTRY = 'SG')
GROUP BY subscription_id
,country
,email_type
,email_priority
,created_dt
UNION ALL
SELECT subscription_id
,country
,email_type
,email_priority
,created_dt
FROM crm_arrow.wishlist
WHERE created_dt = '2016-07-07'
AND (COUNTRY = 'SG')
GROUP BY subscription_id
,country
,email_type
,email_priority
,created_dt
UNION ALL
.....
)
SELECT q.subscription_id
,q.country
,q.email_type
FROM (
SELECT t1.subscription_id
,t1.country
,DENSE_RANK() OVER (
PARTITION BY t1.subscription_id
,t1.country ORDER BY t1.email_priority
) global_rank
,CASE
WHEN t1.email_type = t2.email_type
THEN t1.email_type
END email_type
FROM t t1
LEFT JOIN t t2 ON t1.country = t2.country
AND t1.subscription_id = t2.subscription_id
) q
WHERE q.email_type IS NOT NULL
AND (
q.global_rank <= 2
AND country = 'SG'
)
"
How can I make an efficient self-join with a huge inner query? I have also tried wrapping master_query in an outer SELECT statement, but it still does not work.
The problem is just where you put the INSERT statement. Here is an example of how to combine INSERT with a WITH clause:
CREATE TABLE ramesh_test
(key BIGINT,
text_value STRING,
roman_value STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
WITH v_text
AS
(SELECT 1 AS key, 'One' AS value),
v_roman
AS
(SELECT 1 AS key, 'I' AS value)
INSERT OVERWRITE TABLE ramesh_test
SELECT v_text.key, v_text.value, v_roman.value
FROM v_text JOIN v_roman
ON (v_text.key = v_roman.key);
Put the INSERT after the WITH clause, directly above the main SELECT, rather than in front of WITH.
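Applied to the shell variables in the question, one way to do this is to keep the WITH clause and the final SELECT in separate variables and splice the INSERT between them. This is only a minimal sketch: the names cte_part, select_part and statement are hypothetical, the values given to proc_db and master_table are placeholders, and both query parts are shortened stand-ins for the real generated query.

```shell
#!/bin/sh
# Placeholder values; the question does not show what these are set to.
proc_db="proc_db"
master_table="master_table"

# Part 1 (hypothetical variable): the WITH clause only.
# Shortened stand-in for the real CTE with its UNION ALL branches.
cte_part="WITH t AS (
  SELECT subscription_id, country, email_type, email_priority, created_dt
  FROM crm_arrow.birthday
  WHERE created_dt = '2016-07-07' AND country = 'SG'
)"

# Part 2 (hypothetical variable): the final SELECT that reads from the CTE.
# Shortened stand-in for the real self-join query.
select_part="SELECT t1.subscription_id, t1.country, t1.email_type
FROM t t1"

# The INSERT goes between the two parts, not in front of WITH.
statement="$cte_part
INSERT OVERWRITE TABLE $proc_db.$master_table PARTITION (created_dt, country)
$select_part"

# Print the assembled statement so it can be checked before running it.
printf '%s\n' "$statement"
```

Once the printed statement looks right, it could be passed to Hive with hive -e "$statement".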
Hope this helps!
You need to change your query so that INSERT OVERWRITE comes after the WITH block and immediately before the SELECT q.subscription_id clause.
In the sample below, one or more WITH clauses go on top, followed by the INSERT statement and then immediately by the SELECT query:
WITH TABLE1
AS
(
SELECT
cod_index,
CAST(test_1 AS VARCHAR(200)) AS test_1,
CAST(test_2 AS VARCHAR(200)) AS test_2,
CAST(test_3 AS VARCHAR(200)) AS test_3
FROM db_h_gss.tb_h_test_orig
)
INSERT INTO TABLE db_h_gss.tb_h_test_insert PARTITION (cod_index = 1)
SELECT
test_1,
test_2,
test_3
FROM TABLE1 WHERE cod_index = 1;