Hash Join vs. Nested Loop

Tags:

I'm new to SQL and I have a question.

I have this SQL code:

DROP TABLE if exists students;
DROP TABLE if exists grades;

CREATE TABLE students(
    s_id integer NOT NULL PRIMARY KEY,
    s_name text,
    s_last_name text,
    curr_year integer
);

CREATE TABLE grades(
    s_id integer NOT NULL PRIMARY KEY,
    course text,
    c_year integer,
    grade integer,
    FOREIGN KEY (s_id) REFERENCES students
);

INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (1, 'A', 'S', 3);

INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (2, 'A', 'A', 2);

INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (3, 'V', 'B', 1);

INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (4, 'K', 'N', 2);

INSERT INTO grades (s_id, course, c_year, grade)
VALUES (1, 'DB', 2, 98);

INSERT INTO grades (s_id, course, c_year, grade)
VALUES (2, 'OS', 3, 90);

INSERT INTO grades (s_id, course, c_year, grade)
VALUES (3, 'DB', 2, 94);

EXPLAIN ANALYZE
    SELECT * 
    FROM students s JOIN grades gr
    ON s.s_id = gr.s_id
    WHERE curr_year > 0;



CREATE INDEX student_details ON students(s_id, s_name, s_last_name);
CREATE INDEX student_courses ON grades(s_id, course);

EXPLAIN ANALYZE
    SELECT * 
    FROM students s JOIN grades gr
    ON s.s_id = gr.s_id
    WHERE curr_year > 0;


DROP INDEX student_details;
DROP INDEX student_courses;
DROP TABLE students CASCADE;
DROP TABLE grades CASCADE;

And I try to understand the explain outputs. Before the INDEX creating I got hash merge. Here is the explain output:

 Hash Join  (cost=23.50..51.74 rows=270 width=116) (actual time=0.039..0.050 rows=3 loops=1)
   Hash Cond: (gr.s_id = s.s_id)
   ->  Seq Scan on grades gr  (cost=0.00..21.30 rows=1130 width=44) (actual time=0.005..0.008 rows=3 loops=1)
   ->  Hash  (cost=20.12..20.12 rows=270 width=72) (actual time=0.021..0.021 rows=4 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on students s  (cost=0.00..20.12 rows=270 width=72) (actual time=0.006..0.011 rows=4 loops=1)
               Filter: (curr_year > 0)
 Planning time: 0.147 ms
 Execution time: 0.089 ms
(9 rows)

And after the INDEX creating I get Nested Loop:

Nested Loop  (cost=0.00..2.12 rows=1 width=116) (actual time=0.031..0.116 rows=3 loops=1)
   Join Filter: (s.s_id = gr.s_id)
   Rows Removed by Join Filter: 9
   ->  Seq Scan on students s  (cost=0.00..1.05 rows=1 width=72) (actual time=0.012..0.018 rows=4 loops=1)
         Filter: (curr_year > 0)
   ->  Seq Scan on grades gr  (cost=0.00..1.03 rows=3 width=44) (actual time=0.003..0.009 rows=3 loops=4)
 Planning time: 0.396 ms
 Execution time: 0.170 ms

But I can't figure out why? Why the indexing in my case made Nested Loop be preferred over Hash Join? I'll be very happy to get an explanation for that.

Thanks a lot!

729

asked Jan 15 '18 15:01

Vipasana

1 Answers

"Nested loop" join is a bit of a misnomer. Technically it refers to a nested loop -- as its name implies:

for every row in table1
    for every row in table2
        compare values and execute join

In practice, this often works as:

for every row in table1
    for every matching row in table2
        execute join

The difference is subtle, but the "matching" means that the nested loop join can make use of an index. So, a nested loop join can have very poor performance (if the tables are relatively large and there are no indexes) or it can have really good performance (if it can make use of an index).

119

answered Oct 14 '22 08:10

Gordon Linoff

Related questions
                            
                                Oracle SQL: Pivot on multiple columns/fields
                            
                                Storing e-mails in a SQL database
                            
                                How to query data based on multiple 'tags' in SQL?
                            
                                MySQL: ERROR 1006 (HY000) Can't create database (errno: 26469527)
                            
                                How can I select the last edited version of a post?
                            
                                Postgres Function to insert multiple records in two tables
                            
                                SQL Server - Convert VarChar to Date in ALTER TABLE
                            
                                Is Merge and Merge join same in SQL Server?
                            
                                BigQuery select * except two columns
                            
                                SQL column with multiple values
                            
                                Is it possible to return multiple rows in a Firebird's stored procedure?
                            
                                How to use same list twice in WHERE clause?
                            
                                SQL to select unique combination of two columns having interchangeable values
                            
                                EF Core / Sqlite one-to-many relationship failing on Unique Index Constraint
                            
                                How to alter Sequence in PostgreSQL with specific max(id) from a table? [duplicate]
                            
                                YYYY-MM column type in PostgreSQL
                            
                                BigQuery compare Date
                            
                                How to select number of record based on a count in a subquery
                            
                                Oracle : sql select query if condition parameter is null then ignore the parameter
                            
                                Get only the data I want from a db but keep structure

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Hash Join vs. Nested Loop

Tags:

sql

postgresql

query-performance

Vipasana

People also ask

1 Answers

Gordon Linoff

Recent Activity

Donate For Us