I have the following table with nearly 2 million records, and of course it is growing every day. Some columns (the `...Id` columns below) are foreign keys to parent tables, e.g. DirectionId --> Direction table, etc.:
Id    TypeId  DirectionId  UserId  IndicatorId  Date                         Size   ExternalId
2003  100     1            1       1            2015-06-01 00:02:23.0000000  11931  28657340
2004  2       1            2       1            2015-06-01 00:03:21.0000000  10358  28657341
2005  2       2            2       1            2015-06-01 00:03:31.0000000  10848  28657342
2006  100     1            2       1            2015-06-01 00:03:52.0000000  7860   28657343
2007  100     1            3       1            2015-06-01 00:03:59.0000000  13353  28657344
I need to get the datetime of the last message per TypeId and DirectionId. The query below returns what I need:
select TypeId, DirectionID, max(date) as Date
from message
group by TypeId, DirectionID;
TypeId  DirectionId  Date
2       1            2015-06-05 15:12:37.0000000
1       1            2015-06-05 15:12:39.0000000
The problem is that this query takes from 2500 ms to 3000 ms to execute. I added this index:
CREATE NONCLUSTERED INDEX [date_index] ON [mqview].[Message] ([Date] ASC)
INCLUDE ([Id], [TypeId],[DirectionId], [UserId], [Size], [ExternalId]) WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
What can be done to get results faster?
Update
With the proposed index, I get results faster, but now I would like to speed up the same aggregate when it includes the two inner joins shown below. Alternatively, I could run two extra lookup queries against the MessageDirection and MessageType tables if nothing can be done to improve the performance of this query.
SET STATISTICS TIME ON
select mt.Code, md.Code, max(m.date) as Date
from
mqview.Message m
inner join mqview.MessageDirection md on (md.Id = m.DirectionId)
inner join mqview.MessageType mt on (mt.Id = m.TypeId)
group by mt.Code, md.Code
SET STATISTICS TIME OFF
Message:
SQL Server Execution Times:
CPU time = 3343 ms, elapsed time = 2817 ms.
Execution plan: (screenshot omitted)
The way to make a query run faster is to reduce the number of calculations that the software (and therefore hardware) must perform. To do this, you'll need some understanding of how SQL actually makes calculations.
Aggregation takes time, so some slowness is to be expected with a table this large, unless an index supports the grouping.
The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any).
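That fallback strategy can be sketched in T-SQL roughly like this (illustrative only; SQL Server does this internally with a sort or hash aggregate operator, not a visible temp table, and `#work` is a hypothetical name):

```sql
-- Illustrative sketch of the "copy everything, then aggregate" fallback.
-- Every row must be read once to build the work table...
SELECT TypeId, DirectionId, [Date]
INTO #work
FROM mqview.[Message];

-- ...and every row must be read again to find each group's maximum.
SELECT TypeId, DirectionId, MAX([Date]) AS [Date]
FROM #work
GROUP BY TypeId, DirectionId;

DROP TABLE #work;
```

The point of the indexing advice below is to avoid both full passes.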
Your index isn't helpful for this query. You are grouping by TypeId first, whereas your index has the rows ordered by Date first. So to group by TypeId and then DirectionId, the query still has to scan every single row in the table. Then, once it has grouped by these values, it has to look at every row in each group to find the maximum date.
If you had the rows indexed by TypeId, and then by DirectionId, the grouping would be quicker because the rows would naturally be in order of their grouping within the index. If you then add Date to the index, the query knows that the last row in each group has the highest date, which speeds it up a little; but if you make the Date sort in the index descending, then the first row in each group has the highest date. This means that only the first row in each group needs to be looked at, which gives a great speed boost - you will probably find that with this index your query becomes near-instant.
Because the index now contains all the values in your query, the actual rows of the table will not even need to be visited. The database engine can return the values straight from the index. This removes another step from the query processing and makes it faster again.
Your CREATE INDEX statement would look like this:
CREATE INDEX ix_myNewIndex ON [mqview].[Message] (TypeId, DirectionId, [Date] DESC)
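If you want to verify that the optimizer actually picks the new index up, one option (a sketch, assuming the index name `ix_myNewIndex` from the statement above) is to force it with a table hint and compare the `STATISTICS TIME` output against the original run:

```sql
-- Force the new index to confirm it covers the query,
-- then compare CPU/elapsed time with the earlier measurement.
SET STATISTICS TIME ON;

SELECT TypeId, DirectionId, MAX([Date]) AS [Date]
FROM mqview.[Message] WITH (INDEX (ix_myNewIndex))
GROUP BY TypeId, DirectionId;

SET STATISTICS TIME OFF;
```

Normally you would let the optimizer choose on its own; the hint is only a diagnostic to rule out the old Date-first index being picked instead.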
IF OBJECT_ID('tempdb.dbo.#temp') IS NOT NULL
DROP TABLE #temp
GO
CREATE TABLE #temp
(
Id INT PRIMARY KEY,
TypeId TINYINT,
DirectionId TINYINT,
UserId TINYINT,
IndicatorId TINYINT,
[Date] DATETIME2
)
CREATE /*UNIQUE*/ NONCLUSTERED INDEX ix ON #temp (TypeId, DirectionId, [Date] DESC) -- DESC
GO
INSERT INTO #temp (Id, TypeId, DirectionId, UserId, IndicatorId, [Date])
VALUES
(2003, 100, 1, 1, 1, '20150601 00:02:23.0000000'),
(2004, 2 , 1, 2, 1, '20150601 00:03:21.0000000'),
(2005, 2 , 2, 2, 1, '20150601 00:03:31.0000000'),
(2006, 100, 1, 2, 1, '20150601 00:03:52.0000000'),
(2007, 100, 1, 3, 1, '20150601 00:03:59.0000000')
SELECT TypeId, DirectionId, MAX([Date]) AS [Date]
FROM #temp
GROUP BY TypeId, DirectionId
Update:
SELECT mt.Code, md.Code, t.[Date]
FROM (
SELECT TypeId, DirectionID, [Date] = MAX([Date])
FROM mqview.[Message]
GROUP BY TypeId, DirectionId
) t
JOIN mqview.MessageDirection md on md.Id = t.DirectionId
JOIN mqview.MessageType mt on mt.Id = t.TypeId