Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking variable for NULL kills performance

I have following queries:

DECLARE @application_number CHAR(8)= '37832904';
SELECT
    la.LEASE_NUMBER AS lease_number,
    la.[LEASE_APPLICATION] AS application_number,
    tnu.[FOLLOWUP_CODE] AS note_type_code -- catch codes not in codes table
FROM [dbo].[lease_applications] la
LEFT JOIN [dbo].tickler_notes_uniq tnu ON tnu.[ACCOUNT_NUMBER] = la.[ACCOUNT_NUMBER]
WHERE la.LEASE_APPLICATION = @application_number
      OR @application_number IS NULL;



SELECT
    la.LEASE_NUMBER AS lease_number,
    la.[LEASE_APPLICATION] AS application_number,
    tnu.[FOLLOWUP_CODE] AS note_type_code -- catch codes not in codes table
FROM [dbo].[lease_applications] la
LEFT JOIN [dbo].tickler_notes_uniq tnu ON tnu.[ACCOUNT_NUMBER] = la.[ACCOUNT_NUMBER]
WHERE la.LEASE_APPLICATION = @application_number; 

The only difference between these 2 queries is that I've added checking for the variable if it is NULL or not.

The execution plans of these queries are: enter image description here

You can find graphical plan here

So the question is. Why the plans are so different?

UPDATE:

The actual execution plan of the first query can be found here

OPTION(RECOMPILE) changed the actual execution plan to the good one. However the downside of that is that my main goal was to create the TVF with these params and then everybody who uses that function is supposed to provide that option.

It is also worth to mention that my main goal is to create TVF with 2 params. Each of it might be null and might be not but at least 1 of them is supposed to be NOT NULL. These params are more or less equal, they are just different keys in the 2 tables that would give the same result anyway (the same number of rows and so on). That's why I wanted to do something like

WHERE (col1 = @param1 OR @param1 IS NULL) AND (col2 = @param2 OR @param2 IS NULL) AND (@param1 IS NOT NULL or @param2 IS NOT NULL)

So, basically I am not interested in ALL records at all

like image 581
Dmitrij Kultasev Avatar asked Sep 06 '17 07:09

Dmitrij Kultasev


2 Answers

You have two different plans for two different queries. It makes sense that when you have an equality condition on the WHERE clause(la.LEASE_APPLICATION = @application_number)(and having indexes in place) you get an index seek: working as expected!

On the other hand, when you write both conditions into one WHERE clause (la.LEASE_APPLICATION = @application_number OR @application_number IS NULL) the query optimizer has chosen to do a scan.

Even though the parameter value has been supplied and it is not null, the plan that is being used is the cached one and it can not know at compile time the actual value of your parameter.

This is the case if you have a stored procedure and you are calling it with parameters. This is not the case when executing a simple query using a variable. As @sepupic has stated, variable values do not get sniffed.

The plan is generated to handle both cases: when you have a value for your parameter as well as when you have none.

One option to fix your problem would be using OPTION(RECOMPILE) as it has been stated already in the comments. Another option would be to have your queries separated(for ex. having two different stored procedures, called by a third "wrapper" procedure), so that they get optimized accordingly, each one on it's own.

I would suggest you to take a look at this article by Kimberly L. Tripp: Building High Performance Stored Procedures and this other one by Aaron Bertrand: An Updated "Kitchen Sink" Example. I think these are the best articles explaining these kind of scenarios.

Both articles explain this situation, possible problems with it and possible solutions as well such as option(recompile), dynamic sql or having separated stored procedures.

Good luck!

like image 165
Rigerta Avatar answered Sep 30 '22 07:09

Rigerta


Your queries do not use parameters, they use a variable. The variable is not sniffed at the moment the batch is compiled (compilation = making a plan) because the batch is seen as one whole thing. So server has no idea if the variable is null or is not null. And it must make a plan that will be suitable in both cases.

The first query can filter no rows at all so the scan is selected.

The second query does filter, but the value is unknown, so if you use SQL server 2014 and the fintered column is not unique, the estimation is C^3/4 (C= table cardinality)

The situation can be different if you use RECOMPILE query option. When you add it to your query, it's recompiled AFTER the assignment of table variable is done. In this case the variable value is known, and you'll get another plan. It will be a plan based on column statistics for a known value of your filter

like image 29
sepupic Avatar answered Sep 30 '22 07:09

sepupic