Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between INNER JOIN and LEFT SEMI JOIN

Tags:

sql

hql

hive

What is the difference between an INNER JOIN and LEFT SEMI JOIN?

In the scenario below, why am I getting two different results?

The INNER JOIN result set is a lot larger. Can someone explain? I am trying to get the names within table_1 that only appear in table_2.

SELECT name FROM table_1 a     INNER JOIN table_2 b ON a.name=b.name  SELECT name FROM table_1 a     LEFT SEMI JOIN table_2 b ON (a.name=b.name) 
like image 882
user3023355 Avatar asked Feb 12 '14 20:02

user3023355


People also ask

Is a left semi join faster than inner join?

Use LEFT SEMI JOIN if you want to list the matching record from the left hand side table only once for each matching record in the right hand side. LEFT SEMI JOIN is better performant when compared to the INNER JOIN.

What is a left semi join?

A LEFT SEMIJOIN (or just SEMIJOIN ) gives only those rows in the left rowset that have a matching row in the right rowset. The RIGHT SEMIJOIN gives only those rows in the right rowset that have a matching row in the left rowset. The join expression in the ON clause specifies how to determine the match.

What is the difference between left right and inner join?

INNER JOIN: returns rows when there is a match in both tables. LEFT JOIN: returns all rows from the left table, even if there are no matches in the right table. RIGHT JOIN: returns all rows from the right table, even if there are no matches in the left table.

What is left semi join in spark?

Semi Join. A semi join returns values from the left side of the relation that has a match with the right. It is also referred to as a left semi join.


1 Answers

An INNER JOIN can return data from the columns from both tables, and can duplicate values of records on either side have more than one match. A LEFT SEMI JOIN can only return columns from the left-hand table, and yields one of each record from the left-hand table where there is one or more matches in the right-hand table (regardless of the number of matches). It's equivalent to (in standard SQL):

SELECT name FROM table_1 a WHERE EXISTS(     SELECT * FROM table_2 b WHERE (a.name=b.name)) 

If there are multiple matching rows in the right-hand column, an INNER JOIN will return one row for each match on the right table, while a LEFT SEMI JOIN only returns the rows from the left table, regardless of the number of matching rows on the right side. That's why you're seeing a different number of rows in your result.

I am trying to get the names within table_1 that only appear in table_2.

Then a LEFT SEMI JOIN is the appropriate query to use.

like image 89
D Stanley Avatar answered Sep 18 '22 17:09

D Stanley