I have a table with a lot of data and I need to join it with some other large tables.
Only a small portion of my table is actually relevant for me each time.
When is it best to filter my data?
In the where clause of the SQL.
Create a temp table with the specific data and only then join it.
Add the predicate to the first inner join ON clause.
Some other idea.
1.
Select *
From RealyBigTable
Inner Join AnotherBigTable On …
Inner Join YetAnotherBigTable On …
Where RealyBigTable.Type = ?
2.
Select *
Into #temp
From RealyBigTable
Where RealyBigTable.Type = ?
Select *
From #temp
Inner Join AnotherBigTable On …
Inner Join YetAnotherBigTable On …
3.
Select *
From RealyBigTable
Inner Join AnotherBigTable On RealyBigTable.type = ? And …
Inner Join YetAnotherBigTable On …
Another question:
What happens first? Join
or Where
?
Because you are using INNER JOINs the WHERE or JOIN debate only depends on your taste and style. Personally, I like to keep the links between the two tables (e.g. foreign key constraint) in the ON clause, and actual filters against data in the WHERE clause.
SQL Server will parse the query into the same token tree, and will therefore build identical query execution plans.
If you were using [LEFT/RIGHT] OUTER JOINS instead, it makes a world of difference since not only is the performance probably different, but also very likely the results.
When is it best to filter my data?
In the WHERE or ON clause, both are seen as the same. For 3, the "first inner join" has no relevance. In a multi-table INNER JOIN scenario, it really doesn't matter which goes first (in the query), as the query optimizer will shuffle the order as it sees fit.
Using a temp table is completely unnecessary and won't help, because you are having to extract the relevant portion anyway - which is what a JOIN would do as well. Moreover, if you had a good index on the JOIN conditions/WHERE filter, the index will be used to only visit the relevant data without looking at the rest of the table(s).
You should put your query in the management studio, tick "include actual execution plan", and run it. That way you will get the exact answer what SQL server did with your query. From then, you can move forward with optimization.
In general:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With