Assume a simple table defined as:
CREATE TABLE Table1
(
[ID] [bigint] NOT NULL IDENTITY(1, 1) NOT FOR REPLICATION,
[State] [tinyint] NOT NULL DEFAULT ((0))
)
ALTER TABLE Table1 ADD CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([ID])
CREATE NONCLUSTERED INDEX [IX_NC_F_Media_StateNotDeleted] ON Table1 ([State]) WHERE ([State]<>(2))
CREATE NONCLUSTERED INDEX [IX_NC_F_Media_State] ON Table1 ([State]) WHERE ([State]=(0))
Values are inserted as follows:
250000 rows with State = 0
1000 rows with State = 5
And the following queries with their respective execution plans:
declare @mID int = 400000;
select State
from Table1
where (ID = @mID and State in (0, 1, 4, 5))
or (@mID IS NULL and State in (0, 4))
Considering that ID
is not null and therefore @mid IS NULL
is mutually exclusive with ID = @mID
, I rewrote the query as:
declare @mID int = 400000;
if @mID is null
begin
select State
from Table1
where State in (0, 4)
end
else
begin
select State
from Table1
where ID=@mID and State in (0, 1, 4, 5)
end
Questions:
@mID
is null, does it really make any difference? The tooltip indicates that the performance hit is pretty much the same and I'm guessing this is due to the data being mostly State = 0
rows.An execution plan is generated when you execute any query which necessarily includes the query along with the plan. Running an execution plan for a particular query provides insights into the SQL Server query optimizer and query engine.
If you can examine and understand execution plans, you can achieve better understanding of the database system and you will write better database code.
Normally, when optimiser generates a plan for the query this plan must be valid for any possible values of parameters. Normally, the plan is cached and it is not re-generated when you run the same query again, so it must remain valid (produce correct result) even if you re-run the query with a different value of the parameter.
So, the plan for the first query must have such shape that would work for any value of the @mID
, including NULL and not-NULL.
Index IX_NC_F_Media_StateNotDeleted
can be used to find values for both parts of the OR
expression, but either optimiser is not smart enough to build a plan that does two seeks of this index and then unions the result, or optimiser decided that such plan would be more expensive.
So, either optimiser is not capable of seeing here that @mid IS NULL
is mutually exclusive with ID = @mID
, or it decides that alternative would be more expensive.
The second query with explicit IF
makes optimiser's choice obvious.
This type of query is called "catch-all" or "sink" query. I recommend to read an excellent article Dynamic Search Conditions in T‑SQL by Erland Sommarskog.
In many cases it is appropriate to add an OPTION (RECOMPILE)
to your first query, like this:
declare @mID int = 400000;
select State
from Table1
where (ID = @mID and State in (0, 1, 4, 5))
or (@mID IS NULL and State in (0, 4))
OPTION(RECOMPILE);
SET @mID = NULL;
select State
from Table1
where (ID = @mID and State in (0, 1, 4, 5))
or (@mID IS NULL and State in (0, 4))
OPTION(RECOMPILE);
Try to run these queries and examine their actual execution plans. You should see that the plan's shape changes according to the actual value of the parameter at the time of execution.
With OPTION(RECOMPILE)
optimiser knows that the generated plan will not be cached as usual, so it takes the actual values of the parameters and inlines them as constants into the query. Once they are constants, here, the optimiser is able to see that NULL IS NULL
is always true (or 400000 IS NULL
is always false) and collapses/simplifies the logic expression. Besides, optimiser is able to pick the best appropriate indexes in each case.
Although the second case performs a seek vs. a scan for the case when @mID is null, does it really make any difference? The tooltip indicates that the performance hit is pretty much the same and I'm guessing this is due to the data being mostly State = 0 rows.
In this case seeking an index is pretty much the same as scanning the whole table. Their sizes (in pages) are the same. If your table had a lot of rows with State=2
, then filtered index would be more efficient, since it would contain less pages than the main table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With