I would expect these two SELECT
s to have the same execution plan and performance. Since there is a leading wildcard on the LIKE
, I expect an index scan. When I run this and look at the plans, the first SELECT
behaves as expected (with a scan). But the second SELECT
plan shows an index seek, and runs 20 times faster.
Code:
-- Uses index scan, as expected:
SELECT 1
FROM AccountAction
WHERE AccountNumber LIKE '%441025586401'
-- Uses index seek somehow, and runs much faster:
declare @empty VARCHAR(30) = ''
SELECT 1
FROM AccountAction
WHERE AccountNumber LIKE '%441025586401' + @empty
Question:
How does SQL Server use an index seek when the pattern starts with a wildcard?
Bonus question:
Why does concatenating an empty string change/improve the execution plan?
Details:
Accounts.AccountNumber
Accounts.AccountNumber
column is a nullable varchar(30)
Table and index definitions:
CREATE TABLE [updatable].[AccountAction](
[ID] [int] IDENTITY(1,1) NOT NULL,
[AccountNumber] [varchar](30) NULL,
[Utility] [varchar](9) NOT NULL,
[SomeData1] [varchar](10) NOT NULL,
[SomeData2] [varchar](200) NULL,
[SomeData3] [money] NULL,
--...
[Created] [datetime] NULL,
CONSTRAINT [PK_Account] PRIMARY KEY NONCLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_updatable_AccountAction_AccountNumber_UtilityCode_ActionTypeCd] ON [updatable].[AccountAction]
(
[AccountNumber] ASC,
[Utility] ASC
)
INCLUDE ([SomeData1], [SomeData2], [SomeData3]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE CLUSTERED INDEX [CIX_Account] ON [updatable].[AccountAction]
(
[Created] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
NOTE: Here is the actual execution plan for the two queries. The names of the objects differ slightly from the code above because I was trying to keep the question simple.
These tests (database AdventureWorks2008R2) shows what happens:
SET NOCOUNT ON;
SET STATISTICS IO ON;
PRINT 'Test #1';
SELECT p.BusinessEntityID, p.LastName
FROM Person.Person p
WHERE p.LastName LIKE '%be%';
PRINT 'Test #2';
DECLARE @Pattern NVARCHAR(50);
SET @Pattern=N'%be%';
SELECT p.BusinessEntityID, p.LastName
FROM Person.Person p
WHERE p.LastName LIKE @Pattern;
SET STATISTICS IO OFF;
SET NOCOUNT OFF;
Results:
Test #1
Table 'Person'. Scan count 1, logical reads 106
Test #2
Table 'Person'. Scan count 1, logical reads 106
The results from SET STATISTICS IO
shows that LIO are the same.
But the execution plans are quite different:
In the first test, SQL Server uses an Index Scan
explicit but in the second test SQL Server uses an Index Seek
which is an Index Seek - range scan
. In the last case SQL Server uses a Compute Scalar
operator to generate these values
[Expr1005] = Scalar Operator(LikeRangeStart([@Pattern])),
[Expr1006] = Scalar Operator(LikeRangeEnd([@Pattern])),
[Expr1007] = Scalar Operator(LikeRangeInfo([@Pattern]))
and, the Index Seek
operator use an Seek Predicate
(optimized) for a range scan
(LastName > LikeRangeStart AND LastName < LikeRangeEnd
) plus another unoptimized Predicate
(LastName LIKE @pattern
).
How can LIKE '%...' seek on an index?
My answer: it isn't a "real" Index Seek
. It's a Index Seek - range scan
which, in this case, has the same performance like Index Scan
.
Please see, also, the difference between Index Seek
and Index Scan
(similar debate):
So…is it a Seek or a Scan?.
Edit 1: The execution plan for OPTION(RECOMPILE)
(see Aaron's recommendation please) shows, also, an Index Scan
(instead of Index Seek
):
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With