SQL Group By Year, Month, Week, Day, Hour SQL vs Procedural Performance

I need to write a query that will group a large number of records by periods of time from Year to Hour.

My initial approach has been to decide the periods procedurally in C#, iterate through each and run the SQL to get the data for that period, building up the dataset as I go.

SELECT Sum(someValues) FROM table1 WHERE deliveryDate BETWEEN @fromDate AND @toDate

I've subsequently discovered I can group the records using Year(), Month(), Day(), datepart(week, date), and datepart(hh, date).

SELECT Sum(someValues) FROM table1 GROUP BY Year(deliveryDate), Month(deliveryDate), Day(deliveryDate)

My concern is that using datepart in a GROUP BY will perform worse than running the query once per period, because it cannot use the index on the datetime field as efficiently. Any thoughts on whether this is true?

Thanks.

asked Jan 27 '09 10:01

RSlaughter

2 Answers

As with anything performance related: measure.

Checking the query plan for the second approach will reveal any obvious problems in advance (a full table scan when you know one is not needed), but there is no substitute for measuring. In SQL performance testing, that measurement should be done with appropriately sized test data.
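A minimal sketch of what such a measurement could look like. It uses Python's built-in sqlite3 module purely so that it runs anywhere; that is an assumption for illustration, and the numbers it prints say nothing about SQL Server. The schema, the index name ix_delivery, the row count and the helper names are all hypothetical. The per-day queries use a half-open range (>= and <) rather than BETWEEN, because BETWEEN includes both endpoints and a row sitting exactly on a boundary would be counted in two periods.

```python
import sqlite3
import time
from datetime import datetime, timedelta

def build_db(rows=20000):
    """Create an in-memory table with an index on deliveryDate (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (deliveryDate TEXT, someValues INTEGER)")
    conn.execute("CREATE INDEX ix_delivery ON table1 (deliveryDate)")
    start = datetime(2009, 1, 1)
    data = [((start + timedelta(minutes=7 * i)).strftime("%Y-%m-%d %H:%M:%S"), i % 10)
            for i in range(rows)]
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", data)
    return conn

def grouped(conn):
    """One query: the database buckets the rows by calendar day."""
    sql = ("SELECT date(deliveryDate), SUM(someValues) FROM table1 "
           "GROUP BY date(deliveryDate) ORDER BY 1")
    return list(conn.execute(sql))

def iterated(conn):
    """Many queries: one half-open range query per day, driven from the application."""
    lo, hi = conn.execute(
        "SELECT MIN(deliveryDate), MAX(deliveryDate) FROM table1").fetchone()
    day = datetime.strptime(lo[:10], "%Y-%m-%d")
    last = datetime.strptime(hi[:10], "%Y-%m-%d")
    sql = ("SELECT SUM(someValues) FROM table1 "
           "WHERE deliveryDate >= ? AND deliveryDate < ?")
    out = []
    while day <= last:
        nxt = day + timedelta(days=1)
        (total,) = conn.execute(
            sql, (day.strftime("%Y-%m-%d"), nxt.strftime("%Y-%m-%d"))).fetchone()
        if total is not None:  # skip days without any rows
            out.append((day.strftime("%Y-%m-%d"), total))
        day = nxt
    return out

def timed(fn, conn):
    """Run fn(conn) once and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    result = fn(conn)
    return result, time.perf_counter() - t0

conn = build_db()
grouped_rows, grouped_secs = timed(grouped, conn)
iterated_rows, iterated_secs = timed(iterated, conn)
print(f"grouped: {grouped_secs:.4f}s  iterated: {iterated_secs:.4f}s  "
      f"same result: {grouped_rows == iterated_rows}")
```

Checking that both approaches return the same rows before comparing timings is worth the extra line, since a faster query that gives different totals is not a fair comparison. For real numbers, run the equivalent against your actual server with realistic data volumes.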

Since this is a complex case (you are not simply comparing two ways of writing a single query, but a single-query approach against an iterative one), aspects of your environment may play a major role in the actual performance.

Specifically:

1. The 'distance' between your application and the database, as the latency of each call will be wasted time compared to the one-big-query approach.
2. Whether you are using prepared statements or not (not using them causes additional parsing effort for the database engine on each query).
3. Whether the construction of the range queries itself is costly (heavily influenced by 2).
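To get a feel for the first point, here is a back-of-the-envelope sketch of how per-call latency adds up in the iterative approach. The helper names are hypothetical and the 2 ms round trip is an assumed figure, not a measurement; substitute whatever you observe between your application and your database.

```python
from datetime import datetime, timedelta

def hourly_periods(start, end):
    """Yield (from, to) half-open one-hour ranges covering [start, end)."""
    cur = start
    while cur < end:
        nxt = cur + timedelta(hours=1)
        yield cur, nxt
        cur = nxt

def latency_overhead_seconds(n_queries, round_trip_ms):
    """Time spent purely on round trips, ignoring the query work itself."""
    return n_queries * round_trip_ms / 1000.0

# One query per hour for the whole of 2009 (365 days * 24 hours).
periods = list(hourly_periods(datetime(2009, 1, 1), datetime(2010, 1, 1)))
n_queries = len(periods)

# 2 ms per round trip is an assumption for illustration only.
overhead = latency_overhead_seconds(n_queries, round_trip_ms=2.0)
print(n_queries, "queries,", overhead, "seconds of pure latency")
```

The single grouped query pays that round trip once, so the further the application is from the database, the more the iterative approach has to make up in per-query speed.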

answered Sep 21 '22 11:09

ShuggyCoUk

If you wrap the field in a function on the field side of a comparison, you get a table scan.

The index is on the field, not on datepart(field), so the function must be calculated for every row. I think your hunch is right.
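One way to check this for yourself is to compare the plans for a predicate that wraps the column in a function and one that leaves the column bare. The sketch below does that with Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, which is an assumption made only so the example is self-contained: SQL Server has its own optimizer and its own plan output, so treat this as an illustration of the idea rather than a statement about what SQL Server will do. The table layout and the index name ix_delivery are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (deliveryDate TEXT, someValues INTEGER)")
conn.execute("CREATE INDEX ix_delivery ON table1 (deliveryDate)")

def plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)

# Function applied to the indexed column inside the comparison.
wrapped = plan("SELECT SUM(someValues) FROM table1 "
               "WHERE strftime('%Y', deliveryDate) = '2009'")

# Same year expressed as a half-open range on the bare column.
ranged = plan("SELECT SUM(someValues) FROM table1 "
              "WHERE deliveryDate >= '2009-01-01' AND deliveryDate < '2010-01-01'")

print("function on column:", wrapped)
print("bare column range: ", ranged)
```

Note that the GROUP BY query in the question has no WHERE clause at all, so it has to read every row however the grouping is expressed; the scan-versus-seek difference matters most once you also filter by date.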

answered Sep 23 '22 11:09

Galwegian
