Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hive QL Except clause

Tags:

hive

How do I do an EXCEPT clause (like SQL) in Hive QL

I have 2 tables, and each table is a column of unique ids.

I want to find the list of ids that are only in table 1 but not in table 2

Table 1

apple
orange
pear

Table 2

apple
orange

In SQL you can do an EXCEPT clause (http://en.wikipedia.org/wiki/Set_operations_%28SQL%29) but you can't do that in Hive QL

like image 545
sesquipedalians Avatar asked Jun 04 '12 22:06

sesquipedalians


2 Answers

We can use NOT EXISTS clause in Hive as MINUS equivalent.

SELECT t1.id FROM t1 WHERE NOT EXISTS (SELECT 1 from t2 WHERE t2.id = t1.id);
like image 112
kiruba Avatar answered Nov 18 '22 04:11

kiruba


I don't think there's any built-in way to do this but a LEFT OUTER JOIN should do the trick.

This selects all Ids from table1 that do not exist in table2:

SELECT t1.id FROM table1 t1 LEFT OUTER JOIN table2 t2 ON (t1.id=t2.id) WHERE t2.id IS NULL;
like image 44
Lars Francke Avatar answered Nov 18 '22 03:11

Lars Francke