How to use a SQL window function to calculate a percentage of an aggregate

I need to calculate percentages of various dimensions in a table. I'd like to simplify things by using window functions to calculate the denominator. However, I am having an issue because the numerator has to be an aggregate as well.

As a simple example, take the following table:

create temp table test (d1 text, d2 text, v numeric);
insert into test values ('a','x',5), ('a','y',5), ('a','y',10), ('b','x',20);

If I just want to calculate the share of each individual row out of d1, then windowing functions work fine:

select d1, d2, v/sum(v) over (partition by d1)
from test;

"b";"x";1.00
"a";"x";0.25
"a";"y";0.25
"a";"y";0.50

However, what I need is the share of each (d1, d2) group's total out of its d1 total. The output I am looking for is this:

"b";"x";1.00
"a";"x";0.25
"a";"y";0.75

So I try this:

select d1, d2, sum(v)/sum(v) over (partition by d1)
from test
group by d1, d2;

However, now I get an error:

ERROR:  column "test.v" must appear in the GROUP BY clause or be used in an aggregate function

I'm assuming it is complaining that the window function is not accounted for in the GROUP BY clause. But window functions cannot be put in the GROUP BY clause anyway.

This is using Greenplum 4.1, which is a fork of PostgreSQL 8.4 and shares the same window functions. Note that Greenplum cannot do correlated subqueries.

asked Dec 15 '11 04:12

EvilPuppetMaster

2 Answers

I think you are looking for this:

SELECT d1, d2, sum(v)/sum(sum(v)) OVER (PARTITION BY d1) AS share
FROM   test
GROUP  BY d1, d2;

This produces the requested result.

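Window functions are applied after aggregate functions. The outer sum() in sum(sum(v)) OVER ... is a window function (it has an OVER ... clause attached), while the inner sum() is an aggregate function. Your original query failed because, once there is a GROUP BY, the argument of a window function must itself be a grouped column or an aggregate, and the bare v is neither.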

Effectively the same as:

WITH x AS (
   SELECT d1, d2, sum(v) AS sv
   FROM   test
   GROUP  BY d1, d2
   )
SELECT d1, d2, sv/sum(sv) OVER (PARTITION BY d1) AS share
FROM   x;

Or (without CTE):

SELECT d1, d2, sv/sum(sv) OVER (PARTITION BY d1) AS share
FROM  (
   SELECT d1, d2, sum(v) AS sv
   FROM   test
   GROUP  BY d1, d2
   ) x;

Or @Mu's variant.
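
To make the two evaluation stages visible, here is a small sketch that returns the intermediate values next to the share. It assumes the test table from the question; the aliases group_sum and d1_total are illustrative names, not part of the original query.

```sql
-- Sketch: expose both stages of sum(sum(v)) OVER (...).
-- group_sum and d1_total are illustrative aliases.
SELECT d1, d2
     , sum(v)                                      AS group_sum  -- aggregate: one value per (d1, d2) group
     , sum(sum(v)) OVER (PARTITION BY d1)          AS d1_total   -- window function over the group sums
     , sum(v) / sum(sum(v)) OVER (PARTITION BY d1) AS share
FROM   test
GROUP  BY d1, d2;
```

For the sample data, the row for d1 = 'a', d2 = 'y' has group_sum 15 and d1_total 20, which gives the share 0.75.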

Aside: Greenplum introduced correlated subqueries with version 4.2. See release notes.

answered Sep 30 '22 18:09

Erwin Brandstetter

Do you need to do it all with window functions? Sounds like you just need to group the result you already have by d1 and d2 and then sum the per-row shares:

select d1, d2, sum(p)
from (
    select d1, d2, v/sum(v) over (partition by d1) as p
    from test
) as dt
group by d1, d2

That gives me this:

 d1 | d2 |          sum           
----+----+------------------------
 a  | x  | 0.25000000000000000000
 a  | y  | 0.75000000000000000000
 b  | x  | 1.00000000000000000000
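
As a quick sanity check, the shares within each d1 should add up to 1 (up to rounding in the numeric division). A minimal sketch, reusing the derived table from the query above; the alias total_share is just an illustrative name:

```sql
-- Sketch: the per-row shares within each d1 should total 1.
-- total_share is an illustrative alias.
select d1, sum(p) as total_share
from (
    select d1, v/sum(v) over (partition by d1) as p
    from test
) as dt
group by d1;
```

If any d1 returns a total_share noticeably different from 1, the partitioning or grouping columns are probably not what you intended.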

answered Sep 30 '22 19:09

mu is too short

Tags:

sql

postgresql

aggregate-functions

window-functions

greenplum