Is it possible to count distinct values in conjunction with window functions like OVER(PARTITION BY id)
? Currently my query is as follows:
SELECT congestion.date, congestion.week_nb, congestion.id_congestion, congestion.id_element, ROW_NUMBER() OVER( PARTITION BY congestion.id_element ORDER BY congestion.date), COUNT(DISTINCT congestion.week_nb) OVER( PARTITION BY congestion.id_element ) AS week_count FROM congestion WHERE congestion.date >= '2014.01.01' AND congestion.date <= '2014.12.31' ORDER BY id_element, date
However, when I try to execute the query I get the following error:
"COUNT(DISTINCT": "DISTINCT is not implemented for window functions"
DISTINCT does not appear to be possible within the partition functions.
Msg 10759, Level 15, State 1, Line 1 Use of DISTINCT is not allowed with the OVER clause.
Note: COUNT(DISTINCT) window functions cannot have an ORDER BY clause in the window definition or a non-default window frame. The following example uses COUNT(DISTINCT expression) . The query counts the number of distinct full-time scores that start with 7 for rows partitioned by half-time scores.
It's slow because the database is iterating over all the logs and all the dashboards, then joining them, then sorting them, all before getting down to real work of grouping and aggregating.
No, as the error message states, DISTINCT
is not implemented with windows functions. Aplying info from this link into your case you could use something like:
WITH uniques AS ( SELECT congestion.id_element, COUNT(DISTINCT congestion.week_nb) AS unique_references FROM congestion WHERE congestion.date >= '2014.01.01' AND congestion.date <= '2014.12.31' GROUP BY congestion.id_element ) SELECT congestion.date, congestion.week_nb, congestion.id_congestion, congestion.id_element, ROW_NUMBER() OVER( PARTITION BY congestion.id_element ORDER BY congestion.date), uniques.unique_references AS week_count FROM congestion JOIN uniques USING (id_element) WHERE congestion.date >= '2014.01.01' AND congestion.date <= '2014.12.31' ORDER BY id_element, date
Depending on the situation you could also put a subquery straight into SELECT
-list:
SELECT congestion.date, congestion.week_nb, congestion.id_congestion, congestion.id_element, ROW_NUMBER() OVER( PARTITION BY congestion.id_element ORDER BY congestion.date), (SELECT COUNT(DISTINCT dist_con.week_nb) FROM congestion AS dist_con WHERE dist_con.date >= '2014.01.01' AND dist_con.date <= '2014.12.31' AND dist_con.id_element = congestion.id_element) AS week_count FROM congestion WHERE congestion.date >= '2014.01.01' AND congestion.date <= '2014.12.31' ORDER BY id_element, date
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With