Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Need SARGABLE way to filter records and also specify a default value for NULLs

I am creating an adaptor to custom data from a client. I cannot change their schema or modify the values in their tables, though I can suggest new indices. The approach is to use CTEs to join and reformat the custom data to use our column names, enumerated values, etc. Once the data is reformatted, our standard CTEs can be appended, and a query forged from it that can perform our standard analyses.

Some values resulting from the reformatting are NULL due to LEFT JOINs that had no match, or due to values in their data that are actually NULL.

My task is to substitute default values for NULLs in many fields, and also allow WHERE clauses to be inserted into the query. Currently, ISNULL calls or CASE statements are used to handle the default values. And currently, by the time the WHERE condition is hit, this substitution has already been performed, so that the end user, who has access to our query builder, can filter on a value which might be the default value. If the filter value is the default value, then records with NULL values that were replaced with the default should be selected.

The problem is that if I have myField = ISNULL(myField, 'MyDefault') as my reformatting formula, and later have WHERE myField = 'MyDefault' in an outer layer of the onion (a later CTE), that this where clause is not sargable: the query optimizer does not choose my index on myField.

A partial solution that occurs to me is to not do any NULL replacement in my inner CTEs, then have a CTE that gets the WHERE clauses inserted, then have an outer CTE that performs all the NULL replacements. Such a query could use the indices. (I have verified this.) However, the where clauses could no longer expect that a test of the value against the default value will also pick up the records with NULL values, since that substitution would not yet have occurred.

Is there a way to perform null substitution, allow SARGABLE where filters, and filter on NULL values as though they held the default value?

NOTE on problem size: A typical example involves JOINing a 6 million record table to a 7 million record table with a many-to-many relationship that creates 12 million records. When the filter is SARGABLE, the query takes about 10 seconds. When it is not SARGABLE, it takes over 10 minutes on one machine, and over three minutes on a faster machine.

COMMENT ON CHOSEN SOLUTION:

The clever use of intersection to allow comparison of a field to either a NULL or a non-NULL without ISNULL or other non-sargable functions can be instrumented into our code with the fewest changes to our legacy queries.

COMMENT 2: Missing case

There are these six cases:

  1. Selected value is not null and does not equal the default and does not match the filter value. Should exclude.
  2. Selected value is not null and does not equal the default and DOES match the filter value. Should include.
  3. Selected value is not null and DOES equal the default value and does not match the filter value. Should exclude.
  4. Selected value is not null and DOES equal the default value and DOES match the filter value. Should include.
  5. Selected value is null and the filter value is not the default. Should exclude.
  6. Selected value is null and the filter value is the default. Should include.

Case 4 does not work using the offered solution. The selected field is not null, so the first half of the intersection has a record with anon-null value. But in the second half of the intersection, the NULLIF statement has created a record with a null value. The intersection produces zero records. The record is rejected. I am still looking for a solution that handles this case. So close...

Update Solution:

I have a fix. Say that I am fitering on [County Name] and my default value is 'Unknown'...

where EXISTS (
    select [County Name] 
    intersect 
    (select NULLIF('User selected county name', 'Unknown') union select 'User selected county name')
)
like image 342
Paul Chernoch Avatar asked Jun 21 '13 17:06

Paul Chernoch


2 Answers

It looks like you already are building your query dynamically so when you get a value from your tool that needs to be filtered on you could build a query with a where clause that looks something like this.

SQL Fiddle

MS SQL Server 2008 Schema Setup:

create table YourTable
(
  ID int identity primary key,
  Name varchar(20)
)

create index IX_YourTable_Name on YourTable(Name)

insert into YourTable values
('Name1'),
('Name2'),
(null)

Query 1:

declare @Param varchar(20)
set @Param = 'DefaultName'

select ID,
       coalesce(Name, 'DefaultName') as Name
from YourTable
where exists(select Name intersect select nullif(@Param, 'DefaultName'))

Results:

| ID |        NAME |
--------------------
|  3 | DefaultName |

Query 2:

declare @Param varchar(20)
set @Param = 'Name1'

select ID,
       coalesce(Name, 'DefaultName') as Name
from YourTable
where exists(select Name intersect select nullif(@Param, 'DefaultName'))

Results:

| ID |  NAME |
--------------
|  1 | Name1 |

The query plan for the query above will use IX_YourTable_Name for a seek.

enter image description here

Ref: Undocumented Query Plans: Equality Comparisons

like image 164
Mikael Eriksson Avatar answered Nov 09 '22 11:11

Mikael Eriksson


You said you can't change the schema, but I'm thinking outside the box here. You could add a new database that has views that look into the existing database. For example:

use NewViewDb
GO

CREATE VIEW dbo.[T1T2View]
AS
SELECT field1, field2, COALESCE(field3, 'default value'), ...
FROM RealDb.dbo.Table1 t1 LEFT JOIN RealDb.dbo.Table2 t2 
ON t1.Id = t2.Id

GO
like image 33
chue x Avatar answered Nov 09 '22 12:11

chue x