(I apologize in advance for the awful explanation, but if you run the queries below you should see what I mean!)
Why does MSSQL evaluate statements in the true
section of an if exists
construct, even if the if exists
returns false, causing errors?
For example, in the two queries below, the first checks if a table exists (which it does) and also checks if that table has certain columns. For some reason, running this query throws the following errors because the table exists but the columns don't.
Msg 207, Level 16, State 1, Line 21
Invalid column name 'colB'.
Msg 207, Level 16, State 1, Line 21
Invalid column name 'colC'.
Msg 207, Level 16, State 1, Line 21
Invalid column name 'colA'.
The behavior I expected here was for SQL to just move onto the falsepart
of the construct, without throwing errors. (As it does with the next query).
However, the second script (which is identical, bar table names) executes successfully. This is because the table the query is searching for does not exist.
--Scripts to setup the example.
CREATE DATABASE TEST
GO
USE TEST
GO
CREATE TABLE t1 (colD VARCHAR(255)) --Create a table with the correct name, but incorrect column names.
GO
--This query fails, because t1 exists, even though the columns in t1 don't.
IF EXISTS (select * from INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 't1' AND COLUMN_NAME IN ('colA','colB','colC'))
BEGIN
SELECT colA FROM t1 WHERE colB = 0 AND colC = 1
END
ELSE BEGIN
SELECT 'FALSE'
END
GO
--This query executes ok, because t2 does not exist.
IF EXISTS (select * from INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 't2' AND COLUMN_NAME IN ('colA','colB','colC'))
BEGIN
SELECT colA FROM t2 WHERE colB = 0 AND colC = 1
END
ELSE BEGIN
SELECT 'FALSE'
END
Is anybody able to explain to me why the first query errors, when the second query runs fine?
So far, I've only managed to test this in Microsoft SQL Server 2012.
The SQL EXISTS Operator The EXISTS operator is used to test for the existence of any record in a subquery. The EXISTS operator returns TRUE if the subquery returns one or more records.
If EXISTS (subquery) returns at least 1 row, the result is TRUE. If EXISTS (subquery) returns no rows, the result is FALSE. If NOT EXISTS (subquery) returns at least 1 row, the result is FALSE. If NOT EXISTS (subquery) returns no rows, the result is TRUE.
The SQL NOT EXISTS Operator will perform quite opposite to the EXISTS Operator. It is helpful in restricting the number of rows returned by the SELECT Statement. The NOT EXISTS in SQL Server will check the Subquery for rows existence. If there are no rows then it will return TRUE, otherwise FALSE.
The result of EXISTS is a boolean value True or False. It can be used in a SELECT, UPDATE, INSERT or DELETE statement. Syntax: SELECT column_name(s) FROM table_name WHERE EXISTS (SELECT column_name(s) FROM table_name WHERE condition);
To answer the first part of this question. Assuming familiarity with a language (such as C#) which has some form of runtime type inspection (e.g. Reflection).
Assume you have code like this:
SomeType t = GetSomeTypeFromSomewhere();
if(t.GetType().GetMethod("FunTimes")!=null)
{
t.FunTimes();
}
And assume that SomeType
doesn't contain a public method called FunTimes
. Even though I've written a guard around trying to invoke the FunTimes
method, I get an error. And, specifically, I get a compile time error - the C# compiler cannot even generate the code, let alone get close to running the code, obtaining the result from GetMethod()
and deciding not to run the code within the nested block.
To switch back to your code, the exact same type of analysis applies here:
IF EXISTS (select * from INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 't1' AND COLUMN_NAME IN ('colA','colB','colC'))
BEGIN
SELECT colA FROM t1 WHERE colB = 0 AND colC = 1
END
ELSE BEGIN
SELECT 'FALSE'
END
SQL Server tries to compile this batch and fails. It never executes the code, so it never gets to the point of deciding which branch (IF
or ELSE
) to take.
So, if all of the above is true, why then does the second piece of code work? That's because of an particular feature of T-SQL called Deferred Name Resolution. Basically, there's a special rule that applies when the object that's missing is a table (or view, since the two are indistinguishable until the object can be found). In that specific instance, SQL Server will not immediately signal a compilation error.
Under deferred name resolution, execution will start and, if something causes schema changes (such as by adding the missing table/view), this causes the system to recompile the remainder of the code.
I think you are evaluating the results wrong (AND it is not your fault IMHO).
EXISTS part returns FALSE in both cases. However, the SQL query parser is funny, it parses the inside expressions and gives the error before execution of the statements, only if column(s) is missing, it doesn't give an error if the table is missing.
In your first query where it seems to be evaluating to TRUE, try changing table name to something like t2 and you would see it runs and evaluates to FALSE in both.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With