Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order of tables in join query

Tags:

sql

join

oracle

I found this paragraph in Oracle documentation

if you want to select the name of each department along with the name of its manager, you can write the query in one of two ways. In the first example which follows, the hint /++ordered++/ says to do the join in the order the tables appear in the FROM clause with attempting to optimize the join order.

SELECT /*+ordered*/ d.NAME, e.NAME
FROM DEPT d, EMP e WHERE d.MGR = e.SS#

or:

SELECT /*+ordered*/ d.NAME, e.NAME 
FROM EMP e, DEPT d WHERE d.MGR = e.SS# 

Suppose that there are 10 departments and 1000 employees, and that the inner table in each query has an index on the join column. In the first query, the first table produces 10 qualifying rows (in this case, the whole table). In the second query, the first table produces 1000 qualifying rows. The first query will access the EMP table 10 times and scan the DEPT table once. The second query will scan the EMP table once but will access the DEPT table 1000 times. Therefore the first query will perform much better. As a rule of thumb, tables should be arranged from smallest effective number rows to largest effective number of rows. The effective row size of a table in a query is obtained by applying the logical conditions that are resolved entirely on that table.

But I don't correctly understand this. If there are m rows in table t1 and n rows in table t2, wouldn't the sql engine go through m x n rows in both cases?

Update: Thanks for all the replies. I won't be overriding the optimizer, just wanted to confirm my thought.

like image 651
Jerry Avatar asked Oct 20 '10 10:10

Jerry


2 Answers

Well, in the first case the number of logical reads is 10+10, in the second 1000+1000, with each department being read on average 100 times.

However, writing queries with the ORDERED hitn like this is not normal practice. It's best to leave optimisation to the optimiser most of the time.

I'm not sure exactly which documentation you got that quote from, but where I have seen it it is preceded by this very important paragraph that you omitted. I quote it here for the benefit of others who may otherwise think this method of writing queries is standard:

Normally optimizer picks the best execution plan, an optimal order of tables to be joined. In case the optimizer is not producing a good execution plan you can control the order of execution using the HINTS feature SQL. For more information see the Oracle Database Lite SQL Reference.

-- Oracle® Database Lite Developer's Guide

like image 200
Tony Andrews Avatar answered Sep 28 '22 01:09

Tony Andrews


That depends on the WHERE statement.

SELECT /++ordered++/ d.NAME, e.NAME FROM DEPT d, EMP e WHERE d.MGR = e.SS#

Will select all managers for each department. As there are 10 departments, this results in 10 records being fetched.

SELECT /++ordered++/ d.NAME, e.NAME FROM EMP e, DEPT d

This will select all employees with the name of the department they're working in. As there are 1000 employees, your resultset will have 1000 rows.

A JOIN will never cause your engine to loop over m x n rows, you're resultset of an inner join will always be m if m < n

like image 30
thomaux Avatar answered Sep 27 '22 23:09

thomaux