NATURAL JOIN vs WHERE IN Clauses

Recently, I had to retrieve a large amount of data, thousands of records, from a MySQL database. Since it was my first time handling such a large data set, I didn't think about the efficiency of the SQL statement, and that is where the problem started.

Here are the tables of the database (it is just a simple database model of a curriculum system):

course:

+-----------+---------------------+------+-----+---------+----------------+
| Field     | Type                | Null | Key | Default | Extra          |
+-----------+---------------------+------+-----+---------+----------------+
| course_id | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| name      | varchar(20)         | NO   |     | NULL    |                |
| lecturer  | varchar(20)         | NO   |     | NULL    |                |
| credit    | float               | NO   |     | NULL    |                |
| week_from | tinyint(3) unsigned | NO   |     | NULL    |                |
| week_to   | tinyint(3) unsigned | NO   |     | NULL    |                |
+-----------+---------------------+------+-----+---------+----------------+

select:

+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| select_id | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| card_no   | int(10) unsigned | NO   |     | NULL    |                |
| course_id | int(10) unsigned | NO   |     | NULL    |                |
| term      | varchar(7)       | NO   |     | NULL    |                |
+-----------+------------------+------+-----+---------+----------------+

When I want to retrieve all the courses that a student has selected (looked up by card number), the SQL statement is

SELECT course_id, name, lecturer, credit, week_from, week_to
FROM `course` WHERE course_id IN (
    SELECT course_id FROM `select` WHERE card_no=<student's card number>
);

But it was extremely slow and didn't return anything for a long time. So I changed the WHERE IN clause into a NATURAL JOIN. Here is the SQL:

SELECT course_id, name, lecturer, credit, week_from, week_to
FROM `select` NATURAL JOIN `course`
WHERE card_no=<student's card number>;

It returns immediately and works fine!

So my questions are:

What's the difference between NATURAL JOIN and a WHERE IN clause?
What makes them perform differently? (Is it maybe because I haven't set up any INDEX?)
When should we use NATURAL JOIN, and when WHERE IN?

asked Apr 14 '13 06:04

Wenhao Ji

2 Answers

Theoretically, the two queries are equivalent. I think it's just a poor implementation in the MySQL query optimizer that makes JOIN more efficient than WHERE IN, so I always use JOIN.
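
One caveat on "equivalent", as a sketch rather than a statement about the asker's data: the two forms return the same rows only if each course appears at most once per student in `select`. The table has a term column, so the same course_id could plausibly show up for one student in two terms (that is an assumption; the question doesn't say). In that case the join returns the course twice while IN returns it once. Also, NATURAL JOIN matches on every column name the tables share, which here happens to be just course_id. Writing the join condition out and adding DISTINCT covers both points; 12345 is a made-up card number.

```sql
-- Explicit join on course_id only; DISTINCT collapses repeats
-- if the same course was selected in more than one term.
SELECT DISTINCT c.course_id, c.name, c.lecturer, c.credit, c.week_from, c.week_to
FROM `select` AS s
INNER JOIN `course` AS c ON c.course_id = s.course_id
WHERE s.card_no = 12345;  -- hypothetical card number
```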

Have you looked at the output of EXPLAIN for the two queries? Here's what I got for a WHERE IN:

+----+--------------------+-------------------+----------------+-------------------+---------+---------+------------+---------+--------------------------+
| id | select_type        | table             | type           | possible_keys     | key     | key_len | ref        | rows    | Extra                    |
+----+--------------------+-------------------+----------------+-------------------+---------+---------+------------+---------+--------------------------+
|  1 | PRIMARY            | t_users           | ALL            | NULL              | NULL    | NULL    | NULL       | 2458304 | Using where              |
|  2 | DEPENDENT SUBQUERY | t_user_attributes | index_subquery | PRIMARY,attribute | PRIMARY | 13      | func,const |       7 | Using index; Using where |
+----+--------------------+-------------------+----------------+-------------------+---------+---------+------------+---------+--------------------------+

It's apparently going through every row in the main table (type ALL, about 2.4 million rows) and running the subquery once per row as a DEPENDENT SUBQUERY to test whether the row is in the result. It doesn't use an index on the main table. For the JOIN I get:

+----+-------------+-------------------+--------+---------------------+-----------+---------+---------------------------------------+------+-------------+
| id | select_type | table             | type   | possible_keys       | key       | key_len | ref                                   | rows | Extra       |
+----+-------------+-------------------+--------+---------------------+-----------+---------+---------------------------------------+------+-------------+
|  1 | SIMPLE      | t_user_attributes | ref    | PRIMARY,attribute   | attribute | 1       | const                                 |   15 | Using where |
|  1 | SIMPLE      | t_users           | eq_ref | username,username_2 | username  | 12      | bbodb_test.t_user_attributes.username |    1 |             |
+----+-------------+-------------------+--------+---------------------+-----------+---------+---------------------------------------+------+-------------+

Now it uses the index.
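
To check the same thing on the tables from the question, something like the following should work. It's only a sketch: the index name idx_card_course and the card number 12345 are made up, and whether the IN version gets faster depends on the MySQL version (5.6 and later can rewrite this kind of IN subquery as a semi-join).

```sql
-- `select` has no index on card_no in the question's schema, so add one.
-- Including course_id lets the lookup be answered from the index alone.
CREATE INDEX idx_card_course ON `select` (card_no, course_id);

-- Compare the plans for the two forms of the query.
EXPLAIN
SELECT course_id, name, lecturer, credit, week_from, week_to
FROM `course`
WHERE course_id IN (
    SELECT course_id FROM `select` WHERE card_no = 12345
);

EXPLAIN
SELECT course_id, name, lecturer, credit, week_from, week_to
FROM `select` NATURAL JOIN `course`
WHERE card_no = 12345;
```

If the first plan still shows type ALL on `course` next to a DEPENDENT SUBQUERY row, it is the same slow pattern as in my output above.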

answered Oct 14 '22 14:10

Barmar

Try this:

SELECT course_id, name, lecturer, credit, week_from, week_to
FROM `course` c
WHERE c.course_id IN (
    SELECT s.course_id 
    FROM `select` s
    WHERE card_no=<student's card number>
    AND   c.course_id = s.course_id
);

Notice the addition of the AND clause in the subquery. This is called a correlated subquery because it relates the two course_ids, just as the NATURAL JOIN does.
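
Once the subquery is correlated like this, the IN test no longer adds anything, and the same condition is more commonly written with EXISTS. A sketch, with 12345 standing in for the student's card number; I haven't measured whether it runs faster on the asker's data:

```sql
SELECT c.course_id, c.name, c.lecturer, c.credit, c.week_from, c.week_to
FROM `course` AS c
WHERE EXISTS (
    SELECT 1
    FROM `select` AS s
    WHERE s.card_no = 12345           -- hypothetical card number
      AND s.course_id = c.course_id   -- correlation with the outer row
);
```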

I think Barmar's index explanation is on the mark.

answered Oct 14 '22 15:10

Carl
