Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiently Include Column not in Group By of SQL Query

Given

Table A

Id   INTEGER
Name VARCHAR(50)

Table B

Id   INTEGER
FkId INTEGER  ; Foreign key to Table A

I wish to count the occurrances of each FkId value:

SELECT FkId, COUNT(FkId) 
FROM B 
GROUP BY FkId

Now I simply want to also output the Name from Table A.

This will not work:

SELECT FkId, COUNT(FkId), a.Name
FROM B b
INNER JOIN A a ON a.Id=b.FkId
GROUP BY FkId

because a.Name is not contained in the GROUP BY clause (produces is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause error).

The point is to move from output like this

FkId  Count
1      42
2      25

to output like this

FkId  Count  Name
1      42     Ronald
2      22     John

There are quite a few matches on SO for that error message, but some e.g. https://stackoverflow.com/a/6456944/141172 have comments like "will generate 3 scans on the table, rather than 1, so won't scale".

How can I efficiently include a field from the joined Table B (which has a 1:1 relationship to FkId) in the query output?

like image 591
Eric J. Avatar asked Jul 06 '12 20:07

Eric J.


People also ask

Can we SELECT column which is not part of GROUP BY?

The direct answer is that you can't. You must select either an aggregate or something that you are grouping by.

Do I need to include all columns in GROUP BY?

If you specify the GROUP BY clause, columns referenced must be all the columns in the SELECT clause that do not contain an aggregate function. These columns can either be the column, an expression, or the ordinal number in the column list.

Does GROUP BY improve query performance?

Conclusion. GROUP BY is a powerful statement, but it tends to slow down queries. Over time, my team and I have used it many times and defined SQL indexes to avoid the performance issues introduced by the GROUP BY clause, especially when dealing with large tables.

Can we use where without GROUP BY in SQL?

A query with a having clause should also have a group by clause. If you omit group by, all the rows not excluded by the where clause return as a single group. Because no grouping is performed between the where and having clauses, they cannot act independently of each other.


1 Answers

You can try something like this:

   ;WITH GroupedData AS
   (
       SELECT FkId, COUNT(FkId) As FkCount
       FROM B 
       GROUP BY FkId
   ) 
   SELECT gd.*, a.Name
   FROM GroupedData gd
   INNER JOIN dbo.A ON gd.FkId = A.FkId

Create a CTE (Common Table Expression) to handle the grouping/counting on your Table B, and then join that result (one row per FkId) to Table A and grab some more columns from Table A into your final result set.

like image 116
marc_s Avatar answered Oct 14 '22 06:10

marc_s