Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unexpected result of NATURAL JOIN and USING Clause

So I am learning about obtaining data from multiple tables and I have a question regarding NATURAL JOIN and the USING clause. So I have 2 tables that I'm extracting data from; employees and departments.

SQL> describe employees 
Name                                      Null?    Type
 ----------------------------------------- -------- ------------------------
 EMPLOYEE_ID                                        NUMBER(6)
 FIRST_NAME                                         VARCHAR2(20)
 LAST_NAME                                 NOT NULL VARCHAR2(25)
 EMAIL                                     NOT NULL VARCHAR2(25)
 PHONE_NUMBER                                       VARCHAR2(20)
 HIRE_DATE                                 NOT NULL DATE
 JOB_ID                                    NOT NULL VARCHAR2(10)
 SALARY                                             NUMBER(8,2)
 COMMISSION_PCT                                     NUMBER(2,2)
 MANAGER_ID                                         NUMBER(6)
 DEPARTMENT_ID                                      NUMBER(4)

SQL> describe departments
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------------------
 DEPARTMENT_ID                             NOT NULL NUMBER(4)
 DEPARTMENT_NAME                                    VARCHAR2(30)
 MANAGER_ID                                         NUMBER(6)
 LOCATION_ID                                        NUMBER(4)

When I use NATURAL JOIN and USING in two different expressions, I have two different outputs. I know that USING matches specifically one column in both tables, but how does this affect the output? How come the expression with USING produces one extra value compared to the NATURAL JOIN?

SELECT department_id, manager_id, last_name, location_id
FROM employees NATURAL JOIN departments
WHERE department_id = 80
ORDER BY location_id desc;

DEPARTMENT_ID MANAGER_ID LAST_NAME                 LOCATION_ID
------------- ---------- ------------------------- -----------
       80        149 Abel                             2500
       80        149 Grant                            2500
       80        149 Taylor                           2500

SELECT department_id, departments.manager_id, last_name, location_id
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;

DEPARTMENT_ID MANAGER_ID LAST_NAME                 LOCATION_ID
------------- ---------- ------------------------- -----------
       80        149 Zlotkey                         2500 <-Additional Line*
       80        149 Grant                           2500
       80        149 Taylor                          2500
       80        149 Abel                            2500

Any help and advice is appreciated!

like image 758
jbone Avatar asked Sep 05 '25 02:09

jbone


1 Answers

Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not

If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT * and you’ll see..)

When natural joined this difference in manager_id between the two tables will remove the row from the results

It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.

Edit:

Run this:

SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;

Look at the two columns showing manager ID. The data on the Zlotkey row will be different

You natural join effectively did this:

SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;

And on e.department_id = d.department_id AND e.manager_id = d.manager_id is not true for the zlotkey row

This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in

like image 113
Caius Jard Avatar answered Sep 07 '25 20:09

Caius Jard