Is there any difference in the performance of the following three SQL statements?
SELECT * FROM tableA WHERE EXISTS (SELECT * FROM tableB WHERE tableA.x = tableB.y) SELECT * FROM tableA WHERE EXISTS (SELECT y FROM tableB WHERE tableA.x = tableB.y) SELECT * FROM tableA WHERE EXISTS (SELECT 1 FROM tableB WHERE tableA.x = tableB.y)
They all should work and return the same result set. But does it matter if the inner SELECT selects all fields of tableB, one field, or just a constant?
Is there any best practice when all statements behave equal?
The EXISTS clause is much faster than IN when the subquery results is very large. Conversely, the IN clause is faster than EXISTS when the subquery results is very small. Also, the IN clause can't compare anything with NULL values, but the EXISTS clause can compare everything with NULLs.
If you do an inner join on a UNIQUE column, they exhibit same performance. If you do an inner join on a recordset with DISTINCT applied (to get rid of the duplicates), EXISTS is usually faster.
If you need to check for existence of values in another table, the EXISTS operator is preferred as it clearly demonstrates the intent of the query. If you need to check against more than one single column, you can only use EXISTS since the IN operator only allows you to check for one single column.
Relational algebra is the most common way of writing a query and also the most natural way to do so. The code is clean, easy to troubleshoot, and unsurprisingly, it is also the most efficient way to join two tables.
The truth about the EXISTS clause is that the SELECT clause is not evaluated in an EXISTS clause - you could try:
SELECT * FROM tableA WHERE EXISTS (SELECT 1/0 FROM tableB WHERE tableA.x = tableB.y)
...and should expect a divide by zero error, but you won't because it's not evaluated. This is why my habit is to specify NULL in an EXISTS to demonstrate that the SELECT can be ignored:
SELECT * FROM tableA WHERE EXISTS (SELECT NULL FROM tableB WHERE tableA.x = tableB.y)
All that matters in an EXISTS clause is the FROM and beyond clauses - WHERE, GROUP BY, HAVING, etc.
This question wasn't marked with a database in mind, and it should be because vendors handle things differently -- so test, and check the explain/execution plans to confirm. It is possible that behavior changes between versions...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With