Is the GROUP BY clause in SQL redundant?

Tags:

sql

group-by

Whenever we use an aggregate function in SQL (MIN, MAX, AVG etc), we must always GROUP BY all non-aggregated columns, for instance:

SELECT storeid, storename, SUM(revenue), COUNT(*)
FROM Sales 
GROUP BY storeid, storename

It becomes even more intrusive when we use a function or other calculation in our SELECT statement, as this must also be copied to the GROUP BY clause.

SELECT (2 * (x + y)) / z + 1, MyFunction(x, y), SUM(z)
FROM AnotherTable
GROUP BY (2 * (x + y)) / z + 1, MyFunction(x, y)

If we ever change the SELECT statement, we must remember to make the same change to our GROUP BY clause.

So is the GROUP BY clause redundant?

If this is indeed the case, then why is there a GROUP BY clause in SQL at all?
If this is not the case, then what extra functionality does GROUP BY give us?

asked Dec 22 '10 01:12

Mike Chamberlain

2 Answers

Whenever we use an aggregate function in SQL (MIN, MAX, AVG etc), we must always GROUP BY all non-aggregated columns

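This is not true in general. MySQL, for example, doesn't require this unless the ONLY_FULL_GROUP_BY SQL mode is enabled, and the SQL standard doesn't say this either: since SQL:1999 it permits non-aggregated columns that are functionally dependent on the GROUP BY columns.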

See: Debunking GROUP BY myths
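As a rough illustration of the point above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the answer itself talks about MySQL), and the sample rows are invented. SQLite also accepts a selected column that is missing from the GROUP BY; because storename depends entirely on storeid in this data, the result is unambiguous.

```python
import sqlite3

# Hypothetical sample data: each storeid always has the same storename.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (storeid INTEGER, storename TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, "North", 100), (1, "North", 50), (2, "South", 70)],
)

# storename is selected but not listed in GROUP BY; SQLite accepts this.
rows = conn.execute(
    """
    SELECT storeid, storename, SUM(revenue), COUNT(*)
    FROM Sales
    GROUP BY storeid
    ORDER BY storeid
    """
).fetchall()
print(rows)
```

If storename were not determined by storeid, the value returned for it would come from an arbitrary row of each group, which is why stricter databases reject such a query.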

It becomes even more intrusive when we use a function or other calculation in our SELECT statement, as this must also be copied to the GROUP BY clause.

Also not true in general. MySQL (and perhaps other databases too) allows column aliases to be used in the GROUP BY clause:

SELECT (2 * (x + y)) / z + 1 AS a, MyFunction(x, y) AS b, SUM(z)
FROM AnotherTable
GROUP BY a, b
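The alias form can be tried out with a small sketch like the one below. It assumes SQLite (through Python's sqlite3 module) rather than MySQL, uses made-up rows, and leaves out MyFunction, which is not defined anywhere in the question. It compares grouping by the alias with grouping by the repeated expression.

```python
import sqlite3

# Hypothetical sample rows for the table named in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AnotherTable (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany(
    "INSERT INTO AnotherTable VALUES (?, ?, ?)",
    [(1, 2, 3), (1, 2, 3), (3, 1, 2)],
)

# Group by the alias "a" instead of repeating the expression.
by_alias = conn.execute(
    """
    SELECT (2 * (x + y)) / z + 1 AS a, SUM(z)
    FROM AnotherTable
    GROUP BY a
    ORDER BY a
    """
).fetchall()

# The same query with the expression copied into GROUP BY.
by_expression = conn.execute(
    """
    SELECT (2 * (x + y)) / z + 1 AS a, SUM(z)
    FROM AnotherTable
    GROUP BY (2 * (x + y)) / z + 1
    ORDER BY a
    """
).fetchall()

print(by_alias == by_expression)
```

Both queries return the same groups here. Not every database resolves aliases in GROUP BY, so this is worth checking against the engine you actually use.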

If this is not the case, then what extra functionality does GROUP BY give us?

The only way to specify what to group by is the GROUP BY clause. It cannot necessarily be deduced from the columns in the SELECT list. In fact, you don't even have to select all the columns mentioned in the GROUP BY:

SELECT MAX(col2)
FROM foo
GROUP BY col1
HAVING COUNT(*) = 2
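To make the query above concrete, here is a small sketch that runs it against invented data, assuming SQLite via Python's sqlite3 module. The grouping column col1 never appears in the SELECT list, so nothing in the SELECT could tell the database how to form the groups.

```python
import sqlite3

# Hypothetical rows: col1 = 1 occurs twice, col1 = 2 once, col1 = 3 three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 5), (3, 7), (3, 9), (3, 1)],
)

# Only groups with exactly two rows survive the HAVING filter.
result = conn.execute(
    """
    SELECT MAX(col2)
    FROM foo
    GROUP BY col1
    HAVING COUNT(*) = 2
    """
).fetchall()
print(result)
```

Only the col1 = 1 group has exactly two rows, so a single maximum is returned.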

answered Sep 28 '22 02:09

Mark Byers

I may agree with what you're saying, but it is not redundant in all cases.

Consider this:

SELECT FirstName 
       + ' (' + REPLACE(Address1, ',', ' ') + ' '
       + REPLACE(Address2, ',', ' ') + ', '
       + UPPER(State) + ' '
       + 'USA)',
       COUNT(*)
FROM Profiles
GROUP BY FirstName, Address1, Address2, State

In this case I just want the number of profiles that share the same first name and the same address.
As you can see, I didn't have to repeat the "complex" operations of the SELECT in the GROUP BY clause.

I think the price of allowing this flexibility ("sometimes like this, sometimes like that") is that you have to repeat yourself most of the time.
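The difference between the two groupings can be shown with a short sketch. It assumes SQLite through Python's sqlite3 module, so string concatenation is written with || instead of +, and the profile rows are invented. Two addresses that differ only by a comma produce the same formatted label, so grouping by the raw columns and grouping by the formatted expression give different results.

```python
import sqlite3

# SQLite concatenates strings with ||; the answer's + syntax is dialect-specific.
LABEL = (
    "FirstName || ' (' || REPLACE(Address1, ',', ' ') || ' ' "
    "|| REPLACE(Address2, ',', ' ') || ', ' || UPPER(State) || ' ' || 'USA)'"
)

# Hypothetical rows: the third address differs from the others only by a comma.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Profiles (FirstName TEXT, Address1 TEXT, Address2 TEXT, State TEXT)"
)
conn.executemany(
    "INSERT INTO Profiles VALUES (?, ?, ?, ?)",
    [
        ("Ann", "1 Main St", "Apt 2", "ny"),
        ("Ann", "1 Main St", "Apt 2", "ny"),
        ("Ann", "1,Main St", "Apt 2", "ny"),
    ],
)

# Grouping by the raw columns, as in the answer.
by_columns = conn.execute(
    f"SELECT {LABEL}, COUNT(*) FROM Profiles "
    "GROUP BY FirstName, Address1, Address2, State"
).fetchall()

# Grouping by the formatted label, as if GROUP BY were inferred from SELECT.
by_label = conn.execute(
    f"SELECT {LABEL}, COUNT(*) FROM Profiles GROUP BY {LABEL}"
).fetchall()

print(by_columns)
print(by_label)
```

With this data the raw-column grouping yields two rows that carry the same label, while grouping by the label merges all three profiles into one row. The explicit GROUP BY is what lets the author choose between the two.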

answered Sep 28 '22 01:09

BeemerGuy

