Avoid sort operator in index plan

I have two tables [LogTable] and [LogTable_Cross].

Below is the schema and script to populate them:

 --Main Table

 CREATE TABLE [dbo].[LogTable]
    (
      [LogID] [int] NOT NULL
                    IDENTITY(1, 1) ,
      [DateSent] [datetime] NULL,
    )
 ON [PRIMARY]
GO
 ALTER TABLE [dbo].[LogTable] ADD CONSTRAINT [PK_LogTable] PRIMARY KEY CLUSTERED  ([LogID]) ON [PRIMARY]
GO
 CREATE NONCLUSTERED INDEX [IX_LogTable_DateSent] ON [dbo].[LogTable] ([DateSent] DESC) ON [PRIMARY]
GO
 CREATE NONCLUSTERED INDEX [IX_LogTable_DateSent_LogID] ON [dbo].[LogTable] ([DateSent] DESC) INCLUDE ([LogID]) ON [PRIMARY]
GO


--Cross table

 CREATE TABLE [dbo].[LogTable_Cross]
    (
      [LogID] [int] NOT NULL ,
      [UserID] [int] NOT NULL
    )
 ON [PRIMARY]
GO
 ALTER TABLE [dbo].[LogTable_Cross] WITH NOCHECK ADD CONSTRAINT [FK_LogTable_Cross_LogTable] FOREIGN KEY ([LogID]) REFERENCES [dbo].[LogTable] ([LogID])
GO
 CREATE NONCLUSTERED INDEX [IX_LogTable_Cross_UserID_LogID]
 ON [dbo].[LogTable_Cross] ([UserID])
 INCLUDE ([LogID])
GO


-- Script to populate them
 INSERT INTO [LogTable]
        SELECT TOP 100000
                DATEADD(day, ( ABS(CHECKSUM(NEWID())) % 65530 ), 0)
        FROM    sys.sysobjects
                CROSS JOIN sys.all_columns


 INSERT INTO [LogTable_Cross]
        SELECT  [LogID] ,
                1
        FROM    [LogTable]
        ORDER BY NEWID()

 INSERT INTO [LogTable_Cross]
        SELECT  [LogID] ,
                2
        FROM    [LogTable]
        ORDER BY NEWID()

 INSERT INTO [LogTable_Cross]
        SELECT  [LogID] ,
                3
        FROM    [LogTable]
        ORDER BY NEWID()


GO

I want to select all logs (from LogTable) that belong to a given user ID (the user ID is checked against the cross table LogTable_Cross), ordered by DateSent descending.

SELECT  DI.LogID              
FROM    LogTable DI              
        INNER JOIN LogTable_Cross DP ON DP.LogID = DI.LogID  
        WHERE  DP.UserID = 1  
ORDER BY DateSent DESC

After running this query, this is my execution plan: [image: execution plan]

As you can see, a Sort operator appears in the plan, presumably because of the line "ORDER BY DateSent DESC".

My question is: why does the Sort operator appear in the plan even though I have the following indexes on the table?

GO
 CREATE NONCLUSTERED INDEX [IX_LogTable_DateSent] ON [dbo].[LogTable] ([DateSent] DESC) ON [PRIMARY]
GO
 CREATE NONCLUSTERED INDEX [IX_LogTable_DateSent_LogID] ON [dbo].[LogTable] ([DateSent] DESC) INCLUDE ([LogID]) ON [PRIMARY]
GO

On the other hand if I remove the join and write the query in this way:

SELECT  DI.LogID              
FROM    LogTable DI              
  --      INNER JOIN LogTable_Cross DP ON DP.LogID = DI.LogID  
        --WHERE  DP.UserID = 1  
ORDER BY DateSent DESC

the plan changes to:

[image: execution plan]

i.e. the Sort operator is gone and the plan shows that the query uses my nonclustered index.

So is there a way to remove the Sort operator from the plan for my query even when I use the join?

EDIT:

I went further and limited "Max Degree of Parallelism" to 1:

[image: Max Degree of Parallelism setting]

Ran the following query again:

SELECT  DI.LogID              
FROM    LogTable DI              
        INNER JOIN LogTable_Cross DP ON DP.LogID = DI.LogID  
        WHERE  DP.UserID = 1  
ORDER BY DateSent DESC

and the plan still has the Sort operator:

[image: execution plan]

Edit 2

Even if I have the following index as suggested:

 CREATE NONCLUSTERED INDEX [IX_LogTable_Cross_UserID_LogID_2]
 ON [dbo].[LogTable_Cross] ([UserID], [LogID])

the plan still has the Sort operator: [image: execution plan]

asked Apr 19 '17 08:04

Raghav

2 Answers

Your second query does not contain the UserID condition, so it is not an equivalent query. The first query is not covered by your indexes on LogTable because UserID is not present in them (and the join has to be performed as well). Therefore, SQL Server has to join the tables (Hash Join, Merge Join or Nested Loops join). SQL Server correctly selects the Hash Join, since the intermediate results are large and are not sorted by LogID. If you give it an intermediate result sorted by LogID (your second edit), it uses a Merge Join; however, the sort by DateSent is still needed. The only solution without a sort is to create an indexed (materialized) view:

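-- Indexed view that materializes the join
CREATE VIEW dbo.vLogTable
WITH SCHEMABINDING
AS
   SELECT  DI.LogID, DI.DateSent, DP.UserID
   FROM dbo.LogTable DI
   INNER JOIN dbo.LogTable_Cross DP ON DP.LogID = DI.LogID
GO

-- CREATE VIEW must be alone in its batch, hence the GO above.
-- The unique clustered index is what persists the view's rows.
CREATE UNIQUE CLUSTERED INDEX CIX_vCustomerOrders
   ON dbo.vLogTable(UserID, DateSent, LogID);
GO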

The view has to be queried with the NOEXPAND hint so that the optimizer can find the CIX_vCustomerOrders index:

SELECT  LogID              
FROM dbo.vLogTable   WITH(NOEXPAND)
    WHERE  UserID = 1  
ORDER BY DateSent DESC

This query is equivalent to your first query. You can check its correctness by inserting the following row:

INSERT INTO LogTable VALUES (CURRENT_TIMESTAMP)

After that, my query still returns the correct result (one row per log of user 1, i.e. 100000 rows with the population script above), whereas your second query returns one row more (100001). You can delete or insert other rows as well: the view stays up to date and my query keeps returning correct results.
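
The check described above can be sketched as three row counts. This is only an illustration that assumes the tables, the view dbo.vLogTable and its clustered index exist as defined earlier; the column aliases are made up for readability:

```sql
-- Rows for user 1 read from the indexed view
SELECT COUNT(*) AS ViewRows
FROM dbo.vLogTable WITH (NOEXPAND)
WHERE UserID = 1;

-- Rows for user 1 from the original join query
SELECT COUNT(*) AS JoinRows
FROM dbo.LogTable DI
INNER JOIN dbo.LogTable_Cross DP ON DP.LogID = DI.LogID
WHERE DP.UserID = 1;

-- Rows returned by the join-free query (every log, regardless of user)
SELECT COUNT(*) AS AllLogRows
FROM dbo.LogTable;
```

ViewRows and JoinRows should always match, while AllLogRows grows by one for every row inserted into LogTable that has no matching row in LogTable_Cross.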

answered Oct 13 '22 23:10

Radim Bača

You get a Sort operation when the join is present because of the parallelism in the preceding steps. When SQL Server processes the records on multiple threads, their order is no longer deterministic. Each thread just pushes its results to the next operator in the pipeline (the Hash Match in your case).

Since the order is not deterministic and you are asking for an order, SQL Server has to sort the result.

You can try adding the MAXDOP 1 query hint, i.e. OPTION (MAXDOP 1), to force SQL Server to run the query on a single thread. This might help in this case, but it can also degrade performance.
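
As a minimal sketch, this is the query from the question with the hint applied at statement level. Whether the Sort disappears is not guaranteed: the question's edit reports that the Sort remained after limiting "Max Degree of Parallelism" to 1, so treat this as something to test rather than a fix:

```sql
SELECT  DI.LogID
FROM    dbo.LogTable DI
        INNER JOIN dbo.LogTable_Cross DP ON DP.LogID = DI.LogID
WHERE   DP.UserID = 1
ORDER BY DI.DateSent DESC
OPTION (MAXDOP 1);  -- restrict this statement to a serial plan
```

The hint only affects this statement, so it is a less invasive way to experiment than changing the server-wide setting.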

The second query can be satisfied using an index scan and the index is ordered and that order is the same as the requested one. The records (keys) in the index are ordered by definition. SQL Server guessed that running the query on one thread and just reading the data using the index is more beneficial than reading the data using multiple threads and sorting them later.
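
To see which operators each query gets without relying on screenshots, one option is to ask SQL Server for the estimated plan as text. The sketch below assumes the tables from the question; SET SHOWPLAN_TEXT has to be the only statement in its batch, hence the GO separators:

```sql
SET SHOWPLAN_TEXT ON;
GO

-- Join-free query: expected to read the DateSent index in order
SELECT  DI.LogID
FROM    dbo.LogTable DI
ORDER BY DI.DateSent DESC;

-- Join query: look for a Sort operator in its plan text
SELECT  DI.LogID
FROM    dbo.LogTable DI
        INNER JOIN dbo.LogTable_Cross DP ON DP.LogID = DI.LogID
WHERE   DP.UserID = 1
ORDER BY DI.DateSent DESC;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

While SHOWPLAN_TEXT is ON the queries are not executed; only their estimated plans are returned.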

answered Oct 14 '22 00:10

Pred

Tags: sql, indexing, sql-server-2008, sql-server-2012, database-tuning