
Partition Function COUNT() OVER possible using DISTINCT

I'm trying to write the following in order to get a running total of distinct NumUsers, like so:

NumUsers = COUNT(DISTINCT [UserAccountKey]) OVER (PARTITION BY [Mth]) 

Management Studio rejects this. The error disappears when I remove the DISTINCT keyword, but then the result is no longer a distinct count.

DISTINCT does not appear to be possible within the partition functions. How do I go about finding the distinct count? Do I use a more traditional method such as a correlated subquery?
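For reference, the correlated-subquery route mentioned above could look roughly like the sketch below. It is only an illustration: the `Logins` table and its rows are made up, and the query is run through Python's built-in `sqlite3` module as a stand-in engine rather than SQL Server. The SQL itself is plain standard SQL, so it should carry over, but that has not been verified here.

```python
import sqlite3

# Hypothetical sample data: (Mth, UserAccountKey). Table name and values are illustrative only.
subquery_rows = [(1, 10), (1, 10), (1, 20), (2, 10), (2, 30), (2, 30), (2, 40)]

subquery_conn = sqlite3.connect(":memory:")
subquery_conn.execute("CREATE TABLE Logins (Mth INTEGER, UserAccountKey INTEGER)")
subquery_conn.executemany("INSERT INTO Logins VALUES (?, ?)", subquery_rows)

# The inner query is re-evaluated for the month of each outer row,
# so every row carries the distinct user count of its own month.
subquery_sql = """
SELECT l.Mth,
       l.UserAccountKey,
       (SELECT COUNT(DISTINCT i.UserAccountKey)
        FROM Logins AS i
        WHERE i.Mth = l.Mth) AS NumUsers
FROM Logins AS l
ORDER BY l.Mth, l.UserAccountKey
"""
subquery_result = subquery_conn.execute(subquery_sql).fetchall()

for row in subquery_result:
    print(row)
```

With this sample data, month 1 has two distinct users and month 2 has three, and each detail row repeats the count for its month.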

Looking into this a bit further, maybe the OVER functions in SQL Server work differently from Oracle's, in that they cannot be used to calculate running totals.

I've added a live example here on SQLfiddle where I attempt to use a partition function to calculate a running total.

whytheq asked Jun 26 '12 07:06




2 Answers

There is a very simple solution using dense_rank():

dense_rank() over (partition by [Mth] order by [UserAccountKey])
  + dense_rank() over (partition by [Mth] order by [UserAccountKey] desc)
  - 1

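This will give you exactly what you were asking for: the number of distinct UserAccountKeys within each month. For any row, the ascending rank counts the distinct keys up to and including its own, and the descending rank counts the distinct keys from its own onward. The row's own key is counted in both, hence the - 1. Note that dense_rank() ranks NULL as a value of its own, so this assumes [UserAccountKey] is not nullable.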
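A minimal way to try the expression out, assuming a made-up `Logins` table and using Python's built-in `sqlite3` module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The sketch compares the per-row result against a plain GROUP BY with COUNT(DISTINCT ...):

```python
import sqlite3

# Hypothetical sample data: (Mth, UserAccountKey). Table name and values are illustrative only.
dense_rank_rows = [(1, 10), (1, 10), (1, 20), (2, 10), (2, 30), (2, 30), (2, 40)]

dense_rank_conn = sqlite3.connect(":memory:")
dense_rank_conn.execute("CREATE TABLE Logins (Mth INTEGER, UserAccountKey INTEGER)")
dense_rank_conn.executemany("INSERT INTO Logins VALUES (?, ?)", dense_rank_rows)

# Ascending rank + descending rank - 1 equals the number of distinct keys in the partition.
dense_rank_sql = """
SELECT Mth,
       UserAccountKey,
       DENSE_RANK() OVER (PARTITION BY Mth ORDER BY UserAccountKey)
     + DENSE_RANK() OVER (PARTITION BY Mth ORDER BY UserAccountKey DESC)
     - 1 AS NumUsers
FROM Logins
ORDER BY Mth, UserAccountKey
"""
dense_rank_result = dense_rank_conn.execute(dense_rank_sql).fetchall()

# Reference values computed the conventional way, one row per month.
grouped_counts = dict(dense_rank_conn.execute(
    "SELECT Mth, COUNT(DISTINCT UserAccountKey) FROM Logins GROUP BY Mth"
))

for mth, key, num_users in dense_rank_result:
    print(mth, key, num_users, "matches GROUP BY:", num_users == grouped_counts[mth])
```

Every row of a month ends up with the same value, which is what a windowed COUNT(DISTINCT ...) would have returned had it been allowed.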

David answered Sep 23 '22 21:09


Necromancing:

It's relatively simple to emulate COUNT(DISTINCT ...) OVER (PARTITION BY ...) by taking the MAX of a DENSE_RANK:

;WITH baseTable AS
(
    SELECT 'RM1' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM1' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR2' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR2' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR3' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM2' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR1' AS ADR
    UNION ALL SELECT 'RM3' AS RM, 'ADR2' AS ADR
)
,CTE AS
(
    SELECT RM, ADR, DENSE_RANK() OVER(PARTITION BY RM ORDER BY ADR) AS dr
    FROM baseTable
)
SELECT
     RM
    ,ADR

    ,COUNT(CTE.ADR) OVER (PARTITION BY CTE.RM ORDER BY ADR) AS cnt1
    ,COUNT(CTE.ADR) OVER (PARTITION BY CTE.RM) AS cnt2
    -- Not supported
    --,COUNT(DISTINCT CTE.ADR) OVER (PARTITION BY CTE.RM ORDER BY CTE.ADR) AS cntDist
    ,MAX(CTE.dr) OVER (PARTITION BY CTE.RM ORDER BY CTE.RM) AS cntDistEmu
FROM CTE

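Note:
This assumes the columns in question are non-nullable.
DENSE_RANK ranks NULL as a value of its own, whereas COUNT(DISTINCT ...) ignores NULLs, so if a partition contains one or more NULL entries, you need to subtract 1 for that partition.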
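One possible way to fold that subtraction into the query is to subtract a per-partition "has NULL" flag. This is a sketch under assumptions, not part of the query above: the data is made up, the column aliases `cntRaw` and `cntDistNullSafe` are invented for illustration, and it is run through Python's built-in `sqlite3` module (SQLite 3.25 or later) instead of SQL Server.

```python
import sqlite3

# Hypothetical data: RM1 contains a NULL address, RM2 does not.
null_rows = [("RM1", "ADR1"), ("RM1", None), ("RM1", "ADR2"),
             ("RM2", "ADR1"), ("RM2", "ADR1")]

null_conn = sqlite3.connect(":memory:")
null_conn.execute("CREATE TABLE baseTable (RM TEXT, ADR TEXT)")
null_conn.executemany("INSERT INTO baseTable VALUES (?, ?)", null_rows)

# cntRaw counts NULL as a distinct value; cntDistNullSafe subtracts 1
# for partitions in which at least one NULL is present.
null_sql = """
WITH CTE AS (
    SELECT RM, ADR,
           DENSE_RANK() OVER (PARTITION BY RM ORDER BY ADR) AS dr
    FROM baseTable
)
SELECT RM,
       ADR,
       MAX(dr) OVER (PARTITION BY RM) AS cntRaw,
       MAX(dr) OVER (PARTITION BY RM)
     - MAX(CASE WHEN ADR IS NULL THEN 1 ELSE 0 END) OVER (PARTITION BY RM)
         AS cntDistNullSafe
FROM CTE
ORDER BY RM, ADR
"""
null_result = null_conn.execute(null_sql).fetchall()

for row in null_result:
    print(row)
```

For RM1 the raw emulation reports 3 while the NULL-adjusted column reports 2, which is what COUNT(DISTINCT ADR) would give. For RM2 both columns agree.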

Stefan Steiger answered Sep 21 '22 21:09