I have a query containing three inner join statements in the Where clause. The query takes roughly 2 minutes to execute. If I simply change the order of two of the inner joins, performance drops to 40 seconds.
How can doing nothing but changing the order of the inner joins have such a drastic impact of query performance? I would have thought the optimizer would figure all this out.
The order doesn't matter for INNER joins. As long as you change your selects from SELECT * to SELECT a.
The join order is the order in which the tables are joined together in a multi-table SQL statement. Ideally, a plan should start with the join that eliminates the most data to minimize the amount of data carried into the subsequent joins.
JOIN order doesn't matter, the query engine will reorganize their order based on statistics for indexes and other stuff.
While it makes no difference (performance wise) to MySQL which order you put the tables in with INNER JOIN (MySQL treats them as equal and will optimize them the same way), it's convention to put the table that you are applying the WHERE clause to first.
SQL is declarative, that is, the JOIN order should not matter.
However it can in practice, say, if it's a complex query when the optimiser does not explore all options (which in theory could take months).
Another option is that it's a very different query if you reorder and you get different results, but this is usually with OUTER JOINs.
And it could also be the way the ON clause is specified It has to change if you reorder the FROM clause. Unless you are using the older (and bad) JOIN-in-the-WHERE-clause.
Finally, if it's a concern you could use parenthesis to change evaluation order to make your intentions clear, say, filter on a large table first to generate a derived table.
Because by changing the order of the joins, SQL Server is coming up with a different execution plan for your query (chances are it's changing the way it's filtering the tables based on your joins).
In this case, I'm guessing you have several large tables...one of which performs the majority of the filtering.
In one query, your joins are joining several of the large tables together and then filtering the records at the end.
In the other, you are filtering the first table down to a much smaller sub-set of the data...and then joining the rest of the tables in. Since that initial table got filtered before joining the other large recordsets, performance is much better.
You could always verify but running the query with the 'Show query plan' option enabled and see what the query plan is for the two different join orders.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With