Quite a lot of database scripts are of the form:
IF NOT EXISTS(SELECT * FROM Countries WHERE Name = 'France')
INSERT INTO(Countries)
However, I've also seen people do:
IF NOT EXISTS(SELECT CountryID FROM Countries WHERE Name = 'France')
INSERT INTO(Countries)
And even:
IF NOT EXISTS(SELECT 1 FROM Countries WHERE Name = 'France')
INSERT INTO(Countries)
The advantage of the last one is supposedly that its more efficient: the query doesn't actually use any of the columns in the subquery, so it might be quicker to not bring any of them back. But it looks odd, so it strikes me that it might confuse some people. And does it make any difference anyway to the actual execution time?
SQL NOT EXISTS in a subquery In simple words, the subquery with NOT EXISTS checks every row from the outer query, returns TRUE or FALSE, and then sends the value to the outer query to use. In even simpler words, when you use SQL NOT EXISTS, the query returns all the rows that don't satisfy the EXISTS condition.
If you specify the GROUP BY clause, columns referenced must be all the columns in the SELECT clause that do not contain an aggregate function. These columns can either be the column, an expression, or the ordinal number in the column list.
Notes. If SELECT DISTINCT is specified or if the SELECT statement contains a GROUP BY clause, the ORDER BY columns must be in the SELECT list. An ORDER BY clause prevents a SELECT statement from being an updatable cursor.
I think it was back in the 6.5 - 7 period of SQL Server that they made the query optimizer smart enough to know that:
IF NOT EXISTS(SELECT * FROM Countries WHERE Name = 'France')
Does not actually need to return any row data. The advice to use SELECT 1
pre-dates that, yet continues on as a myth.
Arguably, it's a fault with the SQL standard - they ought to allow EXISTS
to start with the FROM
clause and not have a SELECT
portion at all.
And from Subqueries with EXISTS:
The select list of a subquery introduced by EXISTS almost always consists of an asterisk (*). There is no reason to list column names because you are just testing whether rows that meet the conditions specified in the subquery exist.
No, sql server is smart enough to make this optimization for you.
No. The only thing that matters is if rows are returned or not, That's why SELECT 1 is good enough.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With