Why does MAX perform so much worse than TOP on an indexed view?

Question

I'm finding that, on an indexed view with appropriate indexes, MAX(date) is performing an entire index scan followed by a stream aggregate whereas TOP (1) date is optimally using the index and only scanning a single row. For large numbers of rows this is leading to serious performance issues. I've included some code to demonstrate the issue below but would be interested to know if others can explain why this behaviour is occuring (it does not occur on a table with similar index) and whether it is a bug in SQL Server's optimiser (I've tested on both 2008 SP2 and on R2, and both show the same issues).

CREATE TABLE dbo.TableWithDate
(
  id INT IDENTITY(1,1) PRIMARY KEY,
  theDate DATE NOT NULL
);

CREATE NONCLUSTERED INDEX [ix_date] ON dbo.TableWithDate([theDate] DESC);

INSERT INTO dbo.TableWithDate(theDate) VALUES('1 MAR 2010'),('1 MAR 2010'), ('3 JUN 2008');

-- Test 1:  max vs top(1) on the table.  They give same optimal plan (scan one row from the index, since index is in order)
SELECT TOP(1) theDate FROM dbo.TableWithDate ORDER BY theDate DESC;
SELECT MAX(theDate) FROM dbo.TableWithDate;

CREATE TABLE dbo.TheJoinTable
(
  identId INT IDENTITY(1,1) PRIMARY KEY,
  foreignId INT NOT NULL,
  someValue INT NOT NULL
);

CREATE NONCLUSTERED INDEX [ix_foreignValue] ON dbo.TheJoinTable([foreignId] ASC);

INSERT INTO dbo.TheJoinTable(foreignId,someValue) VALUES (1,10),(1,20),(1,30),(2,5),(3,6),(3,10);

GO

CREATE VIEW dbo.TheTablesJoined 
WITH SCHEMABINDING
AS 
  SELECT T2.identId, T1.id, T1.theDate, T2.someValue
  FROM dbo.TableWithDate AS T1
  INNER JOIN dbo.TheJoinTable AS T2 ON T2.foreignId=T1.id
GO

-- Notice the different plans:  the TOP one does a scan of 1 row from each and joins
-- The max one does a scan of the entire index and then does seek operations for each item (less efficient)
SELECT TOP(1) theDate FROM dbo.TheTablesJoined ORDER BY theDate DESC;

SELECT MAX(theDate) FROM dbo.TheTablesJoined;

-- But what about if we put an index on the view?  Does that make a difference?
CREATE UNIQUE CLUSTERED INDEX [ix_clust1] ON dbo.TheTablesJoined([identId] ASC);
CREATE NONCLUSTERED INDEX [ix_dateDesc] ON dbo.TheTablesJoined ([theDate] DESC);

-- No!!!! We are still scanning the entire index (look at the actual number of rows) in the MAX case.
SELECT TOP(1) theDate FROM dbo.TheTablesJoined ORDER BY theDate DESC;

SELECT MAX(theDate) FROM dbo.TheTablesJoined;

arcain · Accepted Answer

John Sansom covered the performance characteristics of MAX vs. TOP, however his results didn't specifically answer your question.

I think the answer lies in the fact that MAX is a general purpose aggregate function geared toward crunching pages and pages of data, where TOP is an operator geared toward only restricting the number of rows being fetched.

In this one narrow use case, both example queries are capable of going after the same thing, and can return the same result. The query using TOP is benefiting from the specific optimizations afforded by using that method for this use case.

I dumped out the XML plans for both queries, and the statement using MAX contained:

<DefinedValues>
  <DefinedValue>
    <ColumnReference Column="Expr1004" />
    <ScalarOperator ScalarString="MAX([db].[dbo].[TheTablesJoined].[theDate])">
      <Aggregate AggType="MAX" Distinct="false">
        <ScalarOperator>
          <Identifier>
            <ColumnReference Database="[db]" Schema="[dbo]" Table="[TheTablesJoined]" Column="theDate" />
          </Identifier>
        </ScalarOperator>
      </Aggregate>
    </ScalarOperator>
  </DefinedValue>
</DefinedValues>

The statement using TOP contained this in place of the XML defining what was being aggregated in the MAX query:

<TopExpression>
  <ScalarOperator ScalarString="(1)">
    <Const ConstValue="(1)" />
  </ScalarOperator>
</TopExpression>

There's a lot less going on in the execution plan when using TOP.

Why does MAX perform so much worse than TOP on an indexed view?

Tags:

sql

tsql

sql-server-2008

Paul McLoughlin

1 Answers

arcain

Recent Activity

Donate For Us

Why does MAX perform so much worse than TOP on an indexed view?

Tags:

sql

tsql

sql-server-2008

Paul McLoughlin

1 Answers

arcain

Related questions

Recent Activity

Donate For Us