I know I need to have (although I don't know why) a GROUP BY
clause on the end of a SQL query that uses any aggregate functions like count
, sum
, avg
, etc:
SELECT count(userID), userName
FROM users
GROUP BY userName
When else would GROUP BY
be useful, and what are the performance ramifications?
Syntax: SELECT column1, function_name(column2) FROM table_name WHERE condition GROUP BY column1, column2 HAVING condition ORDER BY column1, column2; function_name: Name of the function used for example, SUM() , AVG(). table_name: Name of the table. condition: Condition used.
The SQL GROUP BY StatementThe GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". The GROUP BY statement is often used with aggregate functions ( COUNT() , MAX() , MIN() , SUM() , AVG() ) to group the result-set by one or more columns.
The SQL GROUP BY clause allows us to group individual data based on defined criteria. You can group individual data by one or more table columns. In order to do the grouping properly, you often need to apply aggregate functions to the column(s) within the SQL SELECT statement.
The SQL GROUP BY clause is used in a SELECT statement to collect data across multiple records and group the results by one or more columns. A SELECT statement clause that divides the query result into groups of rows, usually for the purpose of performing one or more aggregations on each group.
To retrieve the number of widgets from each widget category that has more than 5 widgets, you could do this:
SELECT WidgetCategory, count(*)
FROM Widgets
GROUP BY WidgetCategory
HAVING count(*) > 5
The "having" clause is something people often forget about, instead opting to retrieve all their data to the client and iterating through it there.
GROUP BY is similar to DISTINCT in that it groups multiple records into one.
This example, borrowed from http://www.devguru.com/technologies/t-sql/7080.asp, lists distinct products in the Products table.
SELECT Product FROM Products GROUP BY Product
Product
-------------
Desktop
Laptop
Mouse
Network Card
Hard Drive
Software
Book
Accessory
The advantage of GROUP BY over DISTINCT, is that it can give you granular control when used with a HAVING clause.
SELECT Product, count(Product) as ProdCnt
FROM Products
GROUP BY Product
HAVING count(Product) > 2
Product ProdCnt
--------------------
Desktop 10
Laptop 5
Mouse 3
Network Card 9
Software 6
Group By forces the entire set to be populated before records are returned (since it is an implicit sort).
For that reason (and many others), never use a Group By in a subquery.
Counting the number of times tags are used might be a google example:
SELECT TagName, Count(*)
AS TimesUsed
FROM Tags
GROUP BY TagName ORDER TimesUsed
If you simply want a distinct value of tags, I would prefer to use the DISTINCT
statement.
SELECT DISTINCT TagName
FROM Tags
ORDER BY TagName ASC
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With