Speed of MIN/MAX vs GROUP BY for aggregating constant-in-group values

Tags:

In the the table T, it is guaranteed that each value of column A is associated with exactly one value of column B (i.e. that there is a functional dependency A → B). Because of this both of the queries below return the same results. Which one will generally run faster?

Using GROUP BY on A and B

select
   A
   ,B
   ,sum(C) 
from
   T
group by
   A
   ,B

or using MAX/MIN on B?

select
   A
   ,MAX(B)
   ,sum(C) 
from
   T
group by
   A

I do know that the GROUP BY A and B version is better at not concealing data issues where an A arrives that is associated with more than one B, I'm just curious about whether one of the queries is generally more work for a DBMS to execute. If the answer depends entirely on the choice of DBMS and you still have interesting information to share then choose your favourite DBMS and answer only for it.

892

asked Jan 30 '17 15:01

Dzamo Norton

1 Answers

Well I went ahead and ran a test on SQL Server 2016 even though I was interested in uncovering more general, theory-based information. I used four columns in the role of B above to accentuate any differences in run time and submitted a batch containing both types of query above. The execution plans generated by SQL Server were almost identical but the cost reported for the GROUP BY query was 53% of the batch while that of the MAX/MIN query was 47%.

The initial index seek step is identical for both queries. It is followed by hash table building step in which the GROUP BY version incurs a higher cost than the MAX/MIN version. The steps after that have negligible cost for both versions.

Counter-intuitively, in spite of the GROUP BY version having a slightly higher cost, it runs in slightly less time. I guess it's still possible to consume more CPU cycles while running if parallelism is greater. At this point I've reached the end of my ability (and appetite) to scry DBMS execution plans so I'll leave it there.

answered Sep 22 '22 10:09

Dzamo Norton

Related questions
                            
                                Given a theorem "P(t) ⟶ (∃x . P(x))" with an object logic implication, why is the proof goal "P(t) ⟹ (∃x . P(x))" given with meta-logic implication?
                            
                                High CPU usage for a python while loop: even when sleeping 97% of the time. Why?
                            
                                Firebase Android: How to cancel download?
                            
                                Django Debug=False whitenoise not working
                            
                                C# threads when executing many tasks in Task.WhenAll
                            
                                Why don't generators transpile well?
                            
                                Too many (potentially 10,000+) drawn polygons lagging google maps
                            
                                Nvidia device error in tensorflow
                            
                                Putting all returned elements into a Spring-Boot cache using annotations
                            
                                How to center text in google sign in button android
                            
                                Left censoring for survival data in R
                            
                                wtforms validating dropdown values

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Speed of MIN/MAX vs GROUP BY for aggregating constant-in-group values

Tags:

Dzamo Norton

People also ask

1 Answers

Dzamo Norton

Recent Activity

Donate For Us