A problem I've encountered a few times: I have a table, table1, in db1. I have table2 in db2. How do I join between the two?
The obvious thing to do is something like:
SELECT *
FROM db1.table1 INNER JOIN db2.table2
ON db1.table1.field1 = db2.table2.field2;
Hive doesn't like this, however; it starts treating "table1" and "table2" as if they were column names, and "db1" and "db2" as table names, and complaining when they don't exist. How do I join between two tables in different databases?
The Hive JOIN clause combines specific fields from two or more tables by using values common to each one. It is broadly similar to the SQL JOIN.
SQL Server allows you to join tables from different databases as long as those databases are on the same server. The join syntax is the same; the only difference is that you must fully qualify table names.
SQL MERGE statement: starting from Hive 2.2, the MERGE statement is supported in Hive if you create a transactional table.
MERGE INTO merge_demo1 A USING merge_demo2 B
ON (A.id = B.id)
WHEN MATCHED THEN UPDATE SET A.lastname = B.lastname
WHEN NOT MATCHED THEN INSERT (id, firstname, lastname) VALUES (B.id, B.
An equi-join is a join based on equality of matching column values. The equality is expressed with the equals sign (=) as the comparison operator in the join condition: the ON clause, or the WHERE clause in the older comma-separated syntax.
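As a rough illustration of an equi-join, here is the same join written both ways against the tables from the question. The aliases a and b are placeholders chosen for this sketch, and the comma-separated form assumes a Hive version that accepts implicit join notation (0.13 or later, as far as I know).

```sql
-- Equi-join with the equality in the ON clause.
SELECT a.field1, b.field2
FROM db1.table1 a
INNER JOIN db2.table2 b
  ON a.field1 = b.field2;

-- The same equi-join in implicit (comma) notation,
-- with the equality moved to the WHERE clause.
SELECT a.field1, b.field2
FROM db1.table1 a, db2.table2 b
WHERE a.field1 = b.field2;
```

Both forms should return the same rows; the explicit ON form is usually easier to read once more than two tables are involved.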
Joins between tables in different databases, in Hive, uniformly require an alias to be set for each {db,table} pair. So instead of the syntax provided in the question, you have to use:
SELECT *
FROM db1.table1 alias1 INNER JOIN db2.table2 alias2
ON alias1.field1 = alias2.field2;
This works. Of course, it's important to remember that if you're asking for particular fields in the SELECT statement, the aliases apply there too. So:
SELECT db1.table1.field1, db2.table2.field2
becomes:
SELECT alias1.field1, alias2.field2
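For completeness, here is a minimal end-to-end sketch of the aliased join. The database, table, and field names come from the question; the column types, the extra name and amount columns, and the sample rows are assumptions made up for illustration, and INSERT ... VALUES assumes Hive 0.14 or later.

```sql
-- Hypothetical setup: column types and sample rows are illustrative only.
CREATE DATABASE IF NOT EXISTS db1;
CREATE DATABASE IF NOT EXISTS db2;

CREATE TABLE IF NOT EXISTS db1.table1 (field1 INT, name STRING);
CREATE TABLE IF NOT EXISTS db2.table2 (field2 INT, amount DOUBLE);

INSERT INTO TABLE db1.table1 VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO TABLE db2.table2 VALUES (1, 10.0), (3, 30.0);

-- Alias each db.table pair once, then use only the aliases
-- in the SELECT list and in the ON clause.
SELECT alias1.field1, alias1.name, alias2.amount
FROM db1.table1 alias1
INNER JOIN db2.table2 alias2
  ON alias1.field1 = alias2.field2;
```

With these sample rows, only the row whose key is 1 exists in both tables, so the inner join returns that single match.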