SQL max() function with a where clause and group by does not use the index efficiently

Tags: sql, sql-server, indexing, max

I have a table MYTABLE with approximately 25 columns, two of which are USERID (int) and USERDATETIME (datetime).

I have an index over this table on these two columns, with USERID being the first column followed by USERDATETIME.

I would like to get the maximum USERDATETIME for each USERID. So:

SELECT USERID, MAX(USERDATETIME)
FROM MYTABLE
WHERE USERDATETIME < '2015-10-11'
GROUP BY USERID

I would have expected the optimizer to find each unique USERID and its maximum USERDATETIME with a number of seeks equal to the number of unique USERIDs, and I would expect this to be reasonably fast. I have 2000 user IDs and 6 million rows in MYTABLE. However, the actual plan shows 6 million rows coming out of an index scan. If I use an index on (USERDATETIME, USERID) instead, the plan changes to an index seek, but it still reads 6 million rows.

Why does SQL not use the index in a way that would reduce the number of rows processed?


asked Dec 14 '15 18:12

Mike

1 Answer

If you are using SQL Server, this kind of access (often called a skip scan) is not an optimisation the product generally carries out, except in limited cases where the table is partitioned by that value.

However, you can do it manually with a recursive CTE that seeks to one USERID at a time:

CREATE TABLE YourTable
  (
     USERID       INT,
     USERDATETIME DATETIME,
     OtherColumns CHAR(10)
  )

CREATE CLUSTERED INDEX IX
  ON YourTable(USERID ASC, USERDATETIME ASC);

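WITH R
     AS (-- Anchor: the highest USERID with its latest USERDATETIME. The TOP ... ORDER BY
         -- sits in a derived table because ORDER BY cannot directly precede UNION ALL.
         SELECT A.USERID,
                A.USERDATETIME
         FROM   (SELECT TOP 1 USERID,
                              USERDATETIME
                 FROM   YourTable
                 ORDER  BY USERID DESC,
                           USERDATETIME DESC) AS A
         UNION ALL
         -- Recursive step: seek to the next lower USERID and keep its latest row.
         -- ROW_NUMBER stands in for TOP, which is not allowed in the recursive member.
         SELECT SubQuery.USERID,
                SubQuery.USERDATETIME
         FROM   (SELECT T.USERID,
                        T.USERDATETIME,
                        rn = ROW_NUMBER()
                               OVER (
                                 ORDER BY T.USERID DESC, T.USERDATETIME DESC)
                 FROM   R
                        JOIN YourTable T
                          ON T.USERID < R.USERID) AS SubQuery
         WHERE  SubQuery.rn = 1)
SELECT *
FROM   R
OPTION (MAXRECURSION 0); -- the default limit of 100 recursion levels is too low for 2000 USERIDs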
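
The query in the question also filters on USERDATETIME < '2015-10-11', which the recursive query above does not apply. The following is a sketch of one way to fold that predicate in, assuming the YourTable definition and clustered index shown earlier. The @Cutoff variable is a name introduced here for illustration only, and the actual plan should be checked before relying on it.

```sql
-- Hypothetical variable holding the cutoff from the question's WHERE clause.
-- The unseparated 'yyyymmdd' form is read the same way under any language setting.
DECLARE @Cutoff DATETIME = '20151011';

WITH R
     AS (-- Anchor: the highest USERID that has a row before the cutoff,
         -- together with its latest qualifying USERDATETIME.
         SELECT A.USERID,
                A.USERDATETIME
         FROM   (SELECT TOP 1 USERID,
                              USERDATETIME
                 FROM   YourTable
                 WHERE  USERDATETIME < @Cutoff
                 ORDER  BY USERID DESC,
                           USERDATETIME DESC) AS A
         UNION ALL
         -- Recursive step: the next lower USERID that has a qualifying row.
         SELECT SubQuery.USERID,
                SubQuery.USERDATETIME
         FROM   (SELECT T.USERID,
                        T.USERDATETIME,
                        rn = ROW_NUMBER()
                               OVER (
                                 ORDER BY T.USERID DESC, T.USERDATETIME DESC)
                 FROM   R
                        JOIN YourTable T
                          ON T.USERID < R.USERID
                             AND T.USERDATETIME < @Cutoff) AS SubQuery
         WHERE  SubQuery.rn = 1)
SELECT USERID,
       USERDATETIME
FROM   R
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels
```

Each recursion level returns one USERID, so with about 2000 distinct user IDs the default MAXRECURSION limit of 100 would be exceeded without the hint.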


If you have another table that holds the user IDs (Users in the example below), it is easier to get an efficient plan with CROSS APPLY:

SELECT U.USERID,
       CA.USERDATETIME
FROM   Users U
       CROSS APPLY (SELECT TOP 1 USERDATETIME
                    FROM   YourTable Y
                    WHERE  Y.USERID = U.USERID
                    ORDER  BY USERDATETIME DESC) CA
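
The date filter from the question can go inside the APPLY. An equality on USERID followed by a range on USERDATETIME matches the key order of the (USERID, USERDATETIME) index, so each user should resolve to one seek that reads a single row, though this is worth confirming in the actual plan. The sketch below assumes the same Users and YourTable tables as above; @Cutoff is again a hypothetical name.

```sql
-- Hypothetical variable holding the cutoff from the question's WHERE clause.
DECLARE @Cutoff DATETIME = '20151011';

SELECT U.USERID,
       CA.USERDATETIME
FROM   Users U
       CROSS APPLY (SELECT TOP 1 Y.USERDATETIME
                    FROM   YourTable Y
                    WHERE  Y.USERID = U.USERID          -- equality on the leading index key
                           AND Y.USERDATETIME < @Cutoff -- range on the second index key
                    ORDER  BY Y.USERDATETIME DESC) CA;
```

CROSS APPLY drops users that have no row before the cutoff, which matches what the GROUP BY query returns. OUTER APPLY would keep those users with a NULL USERDATETIME instead.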



answered Nov 15 '22 01:11

Martin Smith
