If I want to find a set of entries in table A but not in table B, I can use either LEFT OUTER JOIN or NOT EXISTS. I've heard SQL Server is geared towards ANSI and in some case LEFT OUTER JOINs are far more efficient than NOT EXISTS. Will ANSI JOIN perform better in this case? and are join operators more efficient than NOT EXISTS in general on SQL Server?
Many years ago (SQL Server 6.0 ish), LEFT JOIN was quicker, but that hasn't been the case for a very long time. These days, NOT EXISTS is marginally faster. The biggest impact in Access is that the JOIN method has to complete the join before filtering it, constructing the joined set in memory.
In cases like above the Exists statement works faster than that of Joins. Exists will give you a single record and will save the time also. In case of joins the number of records will be more and all the records must be used.
Outer join will normally give you MANY more results ( A*B instead of WHERE A=B ). That'll take more time. An outer join will not normally give you A multiplied by B as the number of results as it isn't a union.
There is not a "better" or a "worse" join type. They have different meaning and they must be used depending on it. In your case, you probably do not have employees with no work_log (no rows in that table), so LEFT JOIN and JOIN will be equivalent in results.
Joe's link is a good starting point. Quassnoi covers this too.
In general, if your fields are properly indexed, OR if you expect to filter out more records (i.e. have a lots of rows EXIST
in the subquery) NOT EXISTS
will perform better.
EXISTS
and NOT EXISTS
both short circuit - as soon as a record matches the criteria it's either included or filtered out and the optimizer moves on to the next record.
LEFT JOIN
will join ALL RECORDS regardless of whether they match or not, then filter out all non-matching records. If your tables are large and/or you have multiple JOIN
criteria, this can be very very resource intensive.
I normally try to use NOT EXISTS
and EXISTS
where possible. For SQL Server, IN
and NOT IN
are semantically equivalent and may be easier to write. These are among the only operators you will find in SQL Server that are guaranteed to short circuit.
Personally, I think that this one gets a big old, "It Depends". I've seen instances where each method has outperformed the other.
Your best bet is to test both and see which performs better. If it's a situation where the tables will always be small and performance isn't as crucial then I'd just go with whichever is the clearest to you (that's usually NOT EXISTS
for most people) and move on.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With