WHERE and JOIN order of operation

Tags:

sql

teradata

My question is similar to this SQL order of operations but with a little twist, so I think it's fair to ask.

I'm using Teradata, and I have two tables: table1 and table2.

table1 has only an id column.
table2 has the following columns: id, val

I might be wrong but I think these two statements give the same results.

Statement 1.

SELECT table1.id, table2.val
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table2.val<100

Statement 2.

SELECT table1.id, table3.val
FROM table1
INNER JOIN (
    SELECT *
    FROM table2
    WHERE val<100
) table3
ON table1.id=table3.id

My question is, will the query optimizer be smart enough to
- execute the WHERE clause first and then do the JOIN in Statement 1
- know that table3 isn't actually needed in Statement 2

I'm pretty new to SQL, so please educate me if I'm misunderstanding anything.


asked Oct 18 '10 15:10

2 Answers

This would depend on many things (table size, indexes, key distribution, etc.). You should just check the execution plan.

You don't say which database, but here are some ways:
MySQL: EXPLAIN
SQL Server: SET SHOWPLAN_ALL (Transact-SQL)
Oracle: EXPLAIN PLAN

What is EXPLAIN in Teradata?
Teradata: Capture and compare plans faster with Visual Explain and XML plan logging
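
For the two statements in the question, a minimal sketch of that check in Teradata is to prefix each request with the EXPLAIN modifier, which returns the optimizer's plan as text instead of running the query. The table and column names are the ones from the question; the wording of the plan text varies by release, so treat this as a starting point rather than a definitive recipe.

```sql
-- Statement 1: filter written in the outer WHERE clause
EXPLAIN
SELECT table1.id, table2.val
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table2.val < 100;

-- Statement 2: filter written inside a derived table
EXPLAIN
SELECT table1.id, table3.val
FROM table1
INNER JOIN (
    SELECT *
    FROM table2
    WHERE val < 100
) table3
ON table1.id = table3.id;
```

When comparing the two outputs, look at which step applies the val < 100 condition and whether that step comes before the join step. If both plans show the same steps, the optimizer is treating the two statements the same way.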


answered Sep 20 '22 05:09

Depending on the availability of statistics and indexes for the tables in question, the query rewrite mechanism in the optimizer may or may not opt to scan table2 for records where val < 100 before scanning table1.

In certain situations, based on data demographics, joins, indexing, and statistics, you may find that the optimizer is not eliminating records in the query plan when you feel that it should, even if you have a derived table such as the one in your example. You can force the optimizer to process a derived table first by placing a GROUP BY in it. The optimizer is then obligated to resolve the GROUP BY aggregate before it can consider resolving the join between the two tables in your example.
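
As a rough sketch of what making statistics available could look like for the tables in the question, the statements below collect column statistics on the filter and join columns and then list what has been collected. Which columns are worth collecting on is an assumption here; it depends on your actual data and indexes.

```sql
-- Statistics on the filtered column, so the optimizer can estimate
-- how many rows satisfy val < 100
COLLECT STATISTICS ON table2 COLUMN (val);

-- Statistics on the join columns of both tables
COLLECT STATISTICS ON table2 COLUMN (id);
COLLECT STATISTICS ON table1 COLUMN (id);

-- List the statistics currently collected on table2
HELP STATISTICS table2;
```

After collecting statistics, it is worth checking the query plan again to see whether the filter on val is now applied earlier.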

SELECT table1.id, table3.val
FROM table1
INNER JOIN (
    SELECT table2.id, table2.val
    FROM table2
    WHERE val<100
    GROUP BY 1,2
) table3
ON table1.id=table3.id

This is not to say that your standard approach should be to use this throughout your code. This is typically one of my last resorts when I have a query plan that simply doesn't eliminate extraneous records early enough in the plan, which results in too much data being scanned and carried around through the various SPOOL files. It is simply a technique you can keep in your toolkit for when you encounter such a situation.

The query rewrite mechanism is continually updated from one release to the next, and the details of how it works can be found in the SQL Transaction Processing Manual for Teradata 13.0.

answered Sep 19 '22 05:09

Rob Paller


Asked by Russell. Answered by KM. and Rob Paller.