
Grouping by all columns except one to count distinct values on the remaining column

I have table A with five rows and the following values:

Column1  Column2  Column3  Column4
-------  -------  -------  -------
anna     ben      cat      d
anna     ben      cat      e
anna     ben      cat      f
gina     hugh     ken      m
gina     hugh     ken      p

I want to add another column called Column5. Its value should be the number of distinct Column4 values among rows that share the same Column1, Column2 and Column3: 3 for the first three rows and 2 for the next two rows:

Column1  Column2  Column3  Column4  Column5
-------  -------  -------  -------  -------
anna     ben      cat      d        3
anna     ben      cat      e        3
anna     ben      cat      f        3
gina     hugh     ken      m        2
gina     hugh     ken      p        2
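
For anyone who wants to reproduce this, here is a minimal setup sketch for the sample data. The column types and lengths are assumptions (the question does not state them); only the table name, column names and values come from the question.

```sql
-- Hypothetical DDL: varchar(20) NOT NULL is an assumption, not from the question.
CREATE TABLE TableA (
    Column1 varchar(20) NOT NULL,
    Column2 varchar(20) NOT NULL,
    Column3 varchar(20) NOT NULL,
    Column4 varchar(20) NOT NULL
);

-- The five rows shown in the question.
INSERT INTO TableA (Column1, Column2, Column3, Column4)
VALUES ('anna', 'ben',  'cat', 'd'),
       ('anna', 'ben',  'cat', 'e'),
       ('anna', 'ben',  'cat', 'f'),
       ('gina', 'hugh', 'ken', 'm'),
       ('gina', 'hugh', 'ken', 'p');
```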

Here is what I tried:

SELECT DISTINCT COUNT (DISTINCT t1.Column4) AS Column5,
Column1, Column2, Column3, Column4
FROM TableA AS t1
GROUP BY Column1, Column2, Column3;

This fails with the following error:

Msg 8120, Level 16, State 1, Procedure COUNT, Line 29
Column 'Column4' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Any help please? Much appreciated.

PS: If I add Column4 to the GROUP BY clause, every row in the result table gets the value 1 in Column5.

Sam asked Jan 16 '23 02:01


2 Answers

One other way to do what you want would be to select distinct rows first, then apply the windowed COUNT() function:

WITH distinctrows AS (
  SELECT DISTINCT
    Column1,
    Column2,
    Column3,
    Column4
  FROM TableA
)
SELECT
  Column1,
  Column2,
  Column3,
  Column4,
  COUNT(Column4) OVER (PARTITION BY Column1, Column2, Column3) AS Column5
FROM distinctrows
;
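
Note that selecting distinct rows first means duplicate (Column1, Column2, Column3, Column4) rows in TableA are collapsed in the output. If every original row has to be kept, one commonly used workaround (not part of the answer above) is to add an ascending and a descending DENSE_RANK() and subtract 1. This is a sketch that assumes Column4 is never NULL, since DENSE_RANK() would rank NULL as a value of its own and count it.

```sql
-- Within a partition with n distinct Column4 values, a value with ascending
-- dense rank r has descending dense rank n - r + 1, so the two ranks
-- always sum to n + 1. Subtracting 1 yields the distinct count n.
-- Assumption: Column4 contains no NULLs.
SELECT
  Column1,
  Column2,
  Column3,
  Column4,
  DENSE_RANK() OVER (PARTITION BY Column1, Column2, Column3 ORDER BY Column4)
  + DENSE_RANK() OVER (PARTITION BY Column1, Column2, Column3 ORDER BY Column4 DESC)
  - 1 AS Column5
FROM TableA
;
```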
Andriy M answered Jan 31 '23 00:01


If you didn't need DISTINCT this would be easy.

SELECT Column1,
       Column2,
       Column3,
       Column4,
       Count(Column4) OVER (partition BY Column1, Column2, Column3) AS Column5
FROM   TableA AS t1 

But windowed aggregates in SQL Server don't currently support DISTINCT, so instead you can compute the distinct count per group in a CTE and join it back to the table:

WITH CTE
     AS (SELECT Column1,
                Column2,
                Column3,
                Count(DISTINCT Column4) AS Column5
         FROM   TableA
         GROUP  BY Column1,
                   Column2,
                   Column3)
SELECT A.Column1,
       A.Column2,
       A.Column3,
       A.Column4,
       C.Column5
FROM   TableA A
       JOIN CTE C
         ON A.Column1 = C.Column1
            AND A.Column2 = C.Column2
            AND A.Column3 = C.Column3 

(For simplicity, I have assumed the columns are not nullable.)
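
If the grouping columns can contain NULLs, a plain equality join drops those rows, because NULL = NULL does not evaluate to true, while GROUP BY still places all NULLs in one group. A possible NULL-tolerant variant of the join is sketched below; it is an illustration under that assumption rather than a tuned query. Separately, COUNT(DISTINCT Column4) ignores NULLs in Column4.

```sql
WITH CTE
     AS (SELECT Column1,
                Column2,
                Column3,
                Count(DISTINCT Column4) AS Column5
         FROM   TableA
         GROUP  BY Column1,
                   Column2,
                   Column3)
SELECT A.Column1,
       A.Column2,
       A.Column3,
       A.Column4,
       C.Column5
FROM   TableA A
       JOIN CTE C
         -- Each pair matches when equal, or when both sides are NULL.
         ON ( A.Column1 = C.Column1
               OR ( A.Column1 IS NULL AND C.Column1 IS NULL ) )
            AND ( A.Column2 = C.Column2
                   OR ( A.Column2 IS NULL AND C.Column2 IS NULL ) )
            AND ( A.Column3 = C.Column3
                   OR ( A.Column3 IS NULL AND C.Column3 IS NULL ) )
```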

Martin Smith answered Jan 30 '23 23:01