I have a table (Orders
) with CLUSTERED IDENTITY PRIMARY KEY
(OrderId
) and I am filtering and sorting the data by add date column (AddDate
).
Is there a way to tell the query optimizer that the AddDate
is ordered the same way the OrderId
is (so the data is ordered by AddDate
already)?
SQL Server does not really need to Scan whole table and then sort it. All the operation requires is to Scan the table until end date is found, then filter out data before start date and return it as is (without sorting it).
Example:
SELECT
*
FROM Orders
WHERE AddDate BETWEEN @FromDate AND @ToDate
ORDER BY AddDate
Does the order of columns in a PK index matter? Yes it does. By default, the primary key constraint is enforced in SQL Server by a unique clustered index. The clustered index defines the logical order of rows in the table.
PRIMARY KEY and UNIQUE constraints When you create a PRIMARY KEY constraint, a unique clustered index on the column or columns is automatically created if a clustered index on the table does not already exist and you do not specify a unique nonclustered index. The primary key column cannot allow NULL values.
The problem with clustered indexes in a GUID field are that the GUIDs are random, so when a new record is inserted, a significant portion of the data on disk has to be moved to insert the records into the middle of the table.
Nope, it can be nonclustered. However, if you don't explicitly define it as nonclustered and there is no clustered index on the table, it'll be created as clustered.
Is there a way to tell the query optimizer that the AddDate is ordered the same way the OrderId is (so the data is ordered by AddDate already)?
No, there isn't any way to do that.
However you can order by OrderId
instead of by AddDate
, if AddDate
is ordered the same way as OrderId
, it would return the same results. But unfortunatly, SQL Server would scan the entire table anyway.
Let't take Northwind Orders tables and OrderDate column.
The query:
SELECT *
FROM dbo.Orders
WHERE OrderDate BETWEEN '1997-12-10' AND '1998-03-05'
ORDER BY OrderDate
Produces this plan. It scans the clustered index entirely while applying a filter, then order the result.
The query:
SELECT *
FROM dbo.Orders
WHERE OrderDate BETWEEN '1997-12-10' AND '1997-12-17'
ORDER BY OrderId -- It's equivalent to ordering by OrderDate
Produces this plan. It also scans the clustered index entirely and apply a filter, but it doesn't order.
Having OrderDate
as the clustered index key would drastically improve the performance of the query, but you might not want to have such a clustered index key. However you cloud create a covering index that would also drastically improve the performance:
CREATE INDEX IX_Orders_OrderDate ON dbo.Orders(OrderDate)
INCLUDE ([OrderID], [CustomerID], [EmployeeID], [RequiredDate], [ShippedDate], [ShipVia], [Freight], [ShipName], [ShipAddress], [ShipCity], [ShipRegion], [ShipPostalCode], [ShipCountry])
The query:
SELECT *
FROM dbo.Orders
WHERE OrderDate BETWEEN '1997-12-10' AND '1998-03-05'
ORDER BY OrderDate
Produces this plan. It just seeks the index. It cannot be faster.
But that index is fat, it would penalize data modifications.
However, you could take advantage of a thinner index like the following:
CREATE INDEX IX_Orders_OrderDate ON dbo.Orders(OrderDate, OrderId)
Using a query like this:
DECLARE @FromOrderId int, @ToOrderId int;
SELECT TOP (1) @FromOrderId = OrderId FROM dbo.Orders WHERE OrderDate <= '1997-12-10' ORDER BY OrderDate DESC, OrderId DESC;
SELECT TOP (1) @ToOrderId = OrderId FROM dbo.Orders WHERE OrderDate >= '1998-03-05' ORDER BY OrderDate ASC, OrderId ASC;
SELECT *
FROM dbo.Orders
WHERE
(OrderId >= @FromOrderId OR @FromOrderId IS NULL)
AND (OrderId <= @ToOrderId OR @ToOrderId IS NULL)
ORDER BY OrderID
OPTION (RECOMPILE)
It produces this plan. It just need 3 seeks to solve the query.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With