What makes an SQL query optimiser decide between a nested loop and a hash join

2 Answers

NESTED LOOPS are good if the condition inside the loop is sargable, that is, an index can be used to limit the number of records.

For a query like this:

SELECT  *
FROM    a
JOIN    b
ON      b.b1 = a.a1
WHERE   a.a2 = @myvar

With a as the leading (outer) table, each record from a is taken in turn and all corresponding records in b must be found.

If b.b1 is indexed and has high cardinality, then NESTED LOOPS will be the preferred method.

In SQL Server, it is also the only way to execute non-equijoins (joins with something other than an = condition in the ON clause).
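To make the nested loops idea concrete, here is a minimal Python sketch of the query above. It is only an illustration, not how SQL Server implements the operator: the function name `nested_loops_join`, the sample rows, and the use of a sorted list with bisection as a stand-in for a B-tree index on b.b1 are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def nested_loops_join(a_rows, b_rows, myvar):
    """Join a to b on b.b1 = a.a1, keeping only rows where a.a2 = myvar."""
    # Hypothetical stand-in for an index on b.b1: rows sorted by key,
    # with a parallel list of keys that can be searched by bisection.
    b_sorted = sorted(b_rows, key=lambda b: b["b1"])
    b_keys = [b["b1"] for b in b_sorted]

    result = []
    for a in a_rows:                      # a is the leading (outer) table
        if a["a2"] != myvar:              # WHERE a.a2 = @myvar
            continue
        # "Index seek": find the range of b rows whose b1 equals a.a1.
        lo = bisect_left(b_keys, a["a1"])
        hi = bisect_right(b_keys, a["a1"])
        for b in b_sorted[lo:hi]:
            result.append({**a, **b})
    return result


# Made-up sample data, for illustration only.
a_rows = [{"a1": 1, "a2": "x"}, {"a1": 2, "a2": "y"}, {"a1": 3, "a2": "x"}]
b_rows = [{"b1": 1, "b2": "p"}, {"b1": 1, "b2": "q"}, {"b1": 3, "b2": "r"}]
joined = nested_loops_join(a_rows, b_rows, "x")
```

The inner lookup touches only the matching slice of b for each outer row, which is why a selective index makes this approach cheap when few rows of a survive the WHERE filter.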

HASH JOIN is the fastest method if all (or almost all) records have to be processed.

It takes all records from b, builds a hash table over them, then takes all records from a and uses the value of the join column as a key to look up the hash table.
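The build and probe phases described above can be sketched in Python as follows. This is a simplified in-memory illustration, not the engine's actual implementation; the function name `hash_join` and the sample rows are made up for the example.

```python
from collections import defaultdict


def hash_join(a_rows, b_rows):
    """Equijoin a to b on b.b1 = a.a1 using a hash table built over b."""
    # Build phase: a single pass over b, hashing each row by its join column.
    table = defaultdict(list)
    for b in b_rows:
        table[b["b1"]].append(b)

    # Probe phase: a single pass over a, using a.a1 as the lookup key.
    result = []
    for a in a_rows:
        for b in table.get(a["a1"], []):
            result.append({**a, **b})
    return result


# Made-up sample data, for illustration only.
hj_a = [{"a1": 1, "a2": "x"}, {"a1": 2, "a2": "y"}, {"a1": 3, "a2": "x"}]
hj_b = [{"b1": 1, "b2": "p"}, {"b1": 1, "b2": "q"}, {"b1": 3, "b2": "r"}]
hj_joined = hash_join(hj_a, hj_b)
```

Each table is read exactly once, so the work does not depend on an index, but the lookup is by exact key, which is why this approach needs an equality condition.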

NESTED LOOPS takes this time:

Na * (Nb / C) * R,

where Na and Nb are the numbers of records in a and b, C is the index cardinality, and R is the constant time required for the row lookup (1 if all fields in the SELECT, WHERE and ORDER BY clauses are covered by the index, about 10 if they are not).

HASH JOIN takes this time:

Na + (Nb * H)

where H is the sum of the constants required to build and look up the hash table (per record). These constants are programmed into the engine.

SQL Server estimates the cardinality from the table statistics, computes and compares the two values, and chooses the cheaper plan.
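As a rough worked example, the two formulas above can be compared directly in Python. This only mirrors the simplified model in this answer, not SQL Server's real costing. The function names are hypothetical, and the value H = 2 is an assumption picked for illustration, since the actual constants are internal to the engine.

```python
def nested_loops_cost(na, nb, c, r):
    # Na * (Nb / C) * R
    return na * (nb / c) * r


def hash_join_cost(na, nb, h):
    # Na + (Nb * H)
    return na + nb * h


def cheaper_join(na, nb, c, r, h):
    """Return the name of the join with the lower cost under this model."""
    nl = nested_loops_cost(na, nb, c, r)
    hj = hash_join_cost(na, nb, h)
    return "NESTED LOOPS" if nl <= hj else "HASH JOIN"


# Few outer rows, unique index on b.b1 (C = Nb), non-covering lookup (R = 10):
# nested loops cost 100 against a hash join cost of 2,000,010.
few_rows = cheaper_join(na=10, nb=1_000_000, c=1_000_000, r=10, h=2)

# Almost all rows of a take part in the join:
# nested loops cost 10,000,000 against a hash join cost of 3,000,000.
all_rows = cheaper_join(na=1_000_000, nb=1_000_000, c=1_000_000, r=10, h=2)
```

Under these assumed numbers, `few_rows` is "NESTED LOOPS" and `all_rows` is "HASH JOIN", which matches the rule of thumb given earlier: selective lookups favour nested loops, while processing most of both tables favours a hash join.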

answered Oct 20 '22 01:10

Quassnoi

Typically, it depends on the size of the sets being joined.

I highly recommend reading "Inside Microsoft SQL Server 2008: T-SQL Querying" by Itzik Ben-Gan:

http://www.solidq.com/insidetsql/books/insidetsql2008/

(the 2005 edition is just as applicable on this topic as well)

He covers your question, as well as many others, on getting the most out of your queries.

answered Oct 19 '22 23:10

casperOne


Tags:

performance

sql

cindi
