Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL performance on LEFT OUTER JOIN vs NOT EXISTS

Tags:

sql

sql-server

If I want to find a set of entries in table A but not in table B, I can use either LEFT OUTER JOIN or NOT EXISTS. I've heard SQL Server is geared towards ANSI and in some case LEFT OUTER JOINs are far more efficient than NOT EXISTS. Will ANSI JOIN perform better in this case? and are join operators more efficient than NOT EXISTS in general on SQL Server?

like image 422
kefeizhou Avatar asked Jul 21 '11 14:07

kefeizhou


People also ask

What is faster Left join or not exists?

Many years ago (SQL Server 6.0 ish), LEFT JOIN was quicker, but that hasn't been the case for a very long time. These days, NOT EXISTS is marginally faster. The biggest impact in Access is that the JOIN method has to complete the join before filtering it, constructing the joined set in memory.

Which is faster exists or join?

In cases like above the Exists statement works faster than that of Joins. Exists will give you a single record and will save the time also. In case of joins the number of records will be more and all the records must be used.

Does Outer join affect performance?

Outer join will normally give you MANY more results ( A*B instead of WHERE A=B ). That'll take more time. An outer join will not normally give you A multiplied by B as the number of results as it isn't a union.

Is Left join more efficient?

There is not a "better" or a "worse" join type. They have different meaning and they must be used depending on it. In your case, you probably do not have employees with no work_log (no rows in that table), so LEFT JOIN and JOIN will be equivalent in results.


2 Answers

Joe's link is a good starting point. Quassnoi covers this too.

In general, if your fields are properly indexed, OR if you expect to filter out more records (i.e. have a lots of rows EXIST in the subquery) NOT EXISTS will perform better.

EXISTS and NOT EXISTS both short circuit - as soon as a record matches the criteria it's either included or filtered out and the optimizer moves on to the next record.

LEFT JOIN will join ALL RECORDS regardless of whether they match or not, then filter out all non-matching records. If your tables are large and/or you have multiple JOIN criteria, this can be very very resource intensive.

I normally try to use NOT EXISTS and EXISTS where possible. For SQL Server, IN and NOT IN are semantically equivalent and may be easier to write. These are among the only operators you will find in SQL Server that are guaranteed to short circuit.

like image 76
JNK Avatar answered Sep 28 '22 03:09

JNK


Personally, I think that this one gets a big old, "It Depends". I've seen instances where each method has outperformed the other.

Your best bet is to test both and see which performs better. If it's a situation where the tables will always be small and performance isn't as crucial then I'd just go with whichever is the clearest to you (that's usually NOT EXISTS for most people) and move on.

like image 45
Tom H Avatar answered Sep 28 '22 04:09

Tom H