Is the HAVING clause redundant?

Question

The following two queries yield the exact same result:

select country, count(organization) as N
from ismember
group by country
having N > 50;

select * from (
  select country, count(organization) as N
  from ismember
  group by country) x
where N > 50;

Can every HAVING clause be replaced by a sub-query and a WHERE clause like this? Or are there situations where a HAVING clause is absolutely necessary/more powerful/more efficient/whatever?

Eugen Rieck · Accepted Answer

There are two questions here. The answer to the first is yes: the result set of a query with a HAVING clause is identical to the result set of the same query run as a subquery and filtered by an outer WHERE clause.
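The rewrite also covers the case where HAVING filters on an aggregate that is not in the select list. A minimal sketch against the `ismember` table from the question; the alias `n` is an arbitrary name chosen for illustration:

```sql
-- HAVING form: the aggregate appears only in the filter.
select country
from ismember
group by country
having count(organization) > 50;

-- Subquery form: the aggregate has to be projected under a name (n)
-- so that the outer WHERE can refer to it.
select country
from (
  select country, count(organization) as n
  from ismember
  group by country
) x
where n > 50;
```

Writing the aggregate out in HAVING, instead of referring to a select-list alias as the question does, also keeps the first query portable to databases that do not accept aliases there.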

The second question is about performance and expressiveness, and here the answer depends heavily on the implementation. On MySQL there is a threshold at which performance starts to diverge: the moment the result set of the inner query can no longer be held in memory. In that case MySQL creates an on-disk representation of the inner query's result and then applies the WHERE filter to it. This does not happen when the HAVING clause is used, because the disqualified groups are simply dropped from the result set.

This implies that the more selective the HAVING clause is, the more it matters for performance. Consider an inner query that produces a million rows, which the HAVING clause reduces to 5: it is very likely that the inner result set cannot be held in memory, but equally likely that the final result set can.
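One way to check how a particular MySQL server treats the two forms is to ask it for the plans. A minimal sketch using the queries from the question; the output columns and the optimizer's choices differ between MySQL versions, so this is a starting point for investigation rather than a definitive diagnosis:

```sql
-- Plan for the HAVING form: one SELECT over ismember.
explain
select country, count(organization) as N
from ismember
group by country
having N > 50;

-- Plan for the subquery form: look for a row whose select_type is DERIVED.
-- That row describes the inner query x, whose result may be materialized
-- before the outer WHERE is applied.
explain
select * from (
  select country, count(organization) as N
  from ismember
  group by country) x
where N > 50;
```

Whether the materialized result actually spills to disk depends on its size and on the server's temporary-table settings, which EXPLAIN alone does not show.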

Edit

I ran into this once: the query selected the few outliers from a very evenly distributed table (number of pieces produced per day on a physical machine in a workshop). I investigated because of the high I/O load.

Edit 2

Please keep in mind that the query cache is not used for subqueries (IMHO an area development should focus on more), so the subquery pattern will not profit from the inner query being a cached result set.

Gert Arnold · Answer

In SQL Server 2008, the two equivalent queries have exactly the same execution plan.

I've also studied a lot of queries generated by Entity Framework (with SQL Server 2008), and so far I have never seen one with a HAVING clause. Grouping queries with a condition on an aggregated result are always translated into a query with a subquery. I trust the ADO.NET team knows what they're doing...

Rob Farley · Answer

The HAVING clause is very useful for avoiding the added complexity of sub-queries. However, the two are logically equivalent, and every HAVING clause can be rewritten using a sub-query as you have done.

In case you're curious, you could also write every WHERE clause as a HAVING clause if you're prepared to take GROUP BY to the extreme.
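A minimal sketch of that idea, using the `ismember` table from the question. It assumes that each (country, organization) pair occurs at most once in the table, so that grouping by both columns leaves every row in its own group; the value 'D' is only a placeholder:

```sql
-- WHERE form: filter individual rows.
select country, organization
from ismember
where country = 'D';

-- HAVING form: group by every selected column so each group is one row,
-- then filter the groups on a grouping column.
-- Assumption: (country, organization) is unique in ismember; otherwise
-- GROUP BY would collapse duplicate rows and the results would differ.
select country, organization
from ismember
group by country, organization
having country = 'D';
```

The second form is only a curiosity: it forces the database to group rows it is about to discard, so the WHERE form remains the sensible choice.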

Tags: sql, mysql, group-by, having, having-clause

fredoverflow
