I have a relatively simple query joining two tables. The "Where" criteria can be expressed either in the join criteria or as a where clause. I'm wondering which is more efficient.
Query is to find max sales for a salesman from the beginning of time until they were promoted.
Case 1
select salesman.salesmanid, max(sales.quantity) from salesman inner join sales on salesman.salesmanid =sales.salesmanid and sales.salesdate < salesman.promotiondate group by salesman.salesmanid
Case 2
select salesman.salesmanid, max(sales.quantity) from salesman inner join sales on salesman.salesmanid =sales.salesmanid where sales.salesdate < salesman.promotiondate group by salesman.salesmanid
Note Case 1 lacks a where clause altogether
RDBMS is Sql Server 2005
EDIT If the second piece of the join criteria or the where clause was sales.salesdate < some fixed date so its not actually any criteria of joining the two tables does that change the answer.
In terms of readability though, especially in complex queries that have multiple joins, it is easier to spot join conditions when they are placed in the ON clause and filter conditions when they are placed in the WHERE clause.
“Is there a performance difference between putting the JOIN conditions in the ON clause or the WHERE clause in MySQL?” No, there's no difference. The following queries are algebraically equivalent inside MySQL and will have the same execution plan.
TLDR: The most efficient join is also the simplest join, 'Relational Algebra'. If you wish to find out more on all the methods of joins, read further. Relational algebra is the most common way of writing a query and also the most natural way to do so.
The advantage of a join includes that it executes faster. The retrieval time of the query using joins almost always will be faster than that of a subquery. By using joins, you can maximize the calculation burden on the database i.e., instead of multiple queries using one join query.
I wouldn't use performance as the deciding factor here - and quite honestly, I don't think there's any measurable performance difference between those two cases, really.
I would always use case #2 - why? Because in my opinion, you should only put the actual criteria that establish the JOIN between the two tables into the JOIN clause - everything else belongs in the WHERE clause.
Just a matter of keeping things clean and put things where they belong, IMO.
Obviously, there are cases with LEFT OUTER JOINs where the placement of the criteria does make a difference in terms of what results get returned - those cases would be excluded from my recommendation, of course.
Marc
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With