Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding how WHERE works with GROUP BY and Aggregation

Tags:

sql

My query -

select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv --1
from Customer as cu inner join SalesInvoice as si --2
on cu.CustomerID = si.CustomerID -- 3
-- put the WHERE clause here ! --4   
group by cu.CustomerID,cu.FirstName,cu.LastName -- 5
where cu.FirstName = 'mark' -- 6

Output with correct code -

enter image description here

Error i get - Incorrect syntax near the keyword 'where'.

Can you tell me why I get this error ? I want to know why WHERE comes before GROUP BY and not after.

like image 579
david blaine Avatar asked Dec 22 '12 20:12

david blaine


People also ask

How does GROUP BY work with aggregate functions?

The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". The GROUP BY statement is often used with aggregate functions ( COUNT() , MAX() , MIN() , SUM() , AVG() ) to group the result-set by one or more columns.

How do GROUP BY and HAVING clauses work with aggregate functions?

The HAVING clause is used instead of WHERE with aggregate functions. While the GROUP BY Clause groups rows that have the same values into summary rows. The having clause is used with the where clause in order to find rows with certain conditions. The having clause is always used after the group By clause.

Does WHERE clause work with aggregate functions?

An aggregate function can be used in a WHERE clause only if that clause is part of a subquery of a HAVING clause and the column name specified in the expression is a correlated reference to a group. If the expression includes more than one column name, each column name must be a correlated reference to the same group.

Can I use WHERE clause with GROUP BY?

GROUP BY Clause is utilized with the SELECT statement. GROUP BY aggregates the results on the basis of selected column: COUNT, MAX, MIN, SUM, AVG, etc. GROUP BY returns only one result per group of data. GROUP BY Clause always follows the WHERE Clause.


2 Answers

You have the order wrong. The WHERE clause goes before the GROUP BY:

select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv 
from Customer as cu 
inner join SalesInvoice as si 
   on cu.CustomerID = si.CustomerID 
where cu.FirstName = 'mark' 
group by cu.CustomerID,cu.FirstName,cu.LastName

If you want to perform a filter after the GROUP BY, then you will use a HAVING clause:

select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv 
from Customer as cu 
inner join SalesInvoice as si 
   on cu.CustomerID = si.CustomerID 
group by cu.CustomerID,cu.FirstName,cu.LastName
having cu.FirstName = 'mark' 

A HAVING clause is typically used for aggregate function filtering, so it makes sense that this would be applied after the GROUP BY

To learn about the order of operations here is article explaining the order. From the article the order of operation in SQL is:

To start out, I thought it would be good to look up the order in which SQL directives get executed as this will change the way I can optimize:

FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause

Using this order you will apply the filter in the WHERE prior to a GROUP BY. The WHERE is used to limit the number of records.

Think of it this way, if you were applying the WHERE after then you would return more records then you would want to group on. Applying it first, reduces the recordset then applies the grouping.

like image 108
Taryn Avatar answered Oct 19 '22 04:10

Taryn


The where clause comes before the group by because conceptually you filter before you group, not after. You want to restrict the output of the that is grouped to only those that match rather than perform the grouping on items that you will, potentially, throw away due to the filter.

like image 35
tvanfosson Avatar answered Oct 19 '22 04:10

tvanfosson