How do I get min, median and max from my query in postgresql?

Tags:

postgresql

I have written a query in which one column is a month. From that I have to get min month, max month, and median month. Below is my query.

select ext.employee,
       pl.fromdate,
       ext.FULL_INC as full_inc,
       prevExt.FULL_INC as prevInc,
       (extract(year from age (pl.fromdate))*12 +extract(month from age (pl.fromdate))) as month,
       case
         when prevExt.FULL_INC is not null then (ext.FULL_INC -coalesce(prevExt.FULL_INC,0))
         else 0
       end as difference,
       (case when prevExt.FULL_INC is not null then (ext.FULL_INC - prevExt.FULL_INC) / prevExt.FULL_INC*100 else 0 end) as percent
from pl_payroll pl
  inner join pl_extpayfile ext
          on pl.cid = ext.payrollid
         and ext.FULL_INC is not null
  left outer join pl_extpayfile prevExt
               on prevExt.employee = ext.employee
              and prevExt.cid = (select max (cid) from pl_extpayfile
                                 where employee = prevExt.employee
                                 and   payrollid = (
                                   select max(p.cid)
                                   from pl_extpayfile,
                                        pl_payroll p
                                   where p.cid = payrollid
                                   and   pl_extpayfile.employee = prevExt.employee
                                   and   p.fromdate < pl.fromdate
                                 ))
               and coalesce(prevExt.FULL_INC, 0) > 0
where ext.employee = 17
and (exists (
    select employee
    from pl_extpayfile preext
    where preext.employee = ext.employee
    and   preext.FULL_INC <> ext.FULL_INC
    and   payrollid in (
      select cid
      from pl_payroll
      where cid = (
        select max(p.cid)
        from pl_extpayfile,
             pl_payroll p
        where p.cid = payrollid
        and   pl_extpayfile.employee = preext.employee
        and   p.fromdate < pl.fromdate
      )
    )
  )
  or not exists (
    select employee
    from pl_extpayfile fext,
         pl_payroll p
    where fext.employee = ext.employee
    and   p.cid = fext.payrollid
    and   p.fromdate < pl.fromdate
    and   fext.FULL_INC > 0
  )
)
order by employee,
         ext.payrollid desc

If the median is not possible, is it at least possible to get the max month and min month?


asked Aug 22 '12 06:08

2 Answers

To calculate the median in PostgreSQL, take the 50th percentile with the built-in percentile_cont ordered-set aggregate (available since PostgreSQL 9.4); no extra functions are needed:

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY x) FROM t;
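
Applied to the question, min, median and max can be fetched in a single pass. The following is a minimal sketch: the CTE named months and its sample values are made up for illustration and stand in for the result of the question's query, which exposes a month column.

```sql
-- Hypothetical sample data; in practice, put the original query here
-- and select its "month" column.
WITH months(month) AS (
    VALUES (3), (7), (12), (18)
)
SELECT min(month)                                         AS min_month,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY month) AS median_month,
       max(month)                                         AS max_month
FROM months;
```

With an even number of rows, percentile_cont interpolates between the two middle values. If the median must be a value that actually occurs in the data, percentile_disc(0.5) can be used the same way.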

answered Sep 23 '22 00:09

Tobi Oetiker

You want the aggregate functions named min and max. See the PostgreSQL documentation and tutorial:

http://www.postgresql.org/docs/current/static/tutorial-agg.html
http://www.postgresql.org/docs/current/static/functions-aggregate.html

There's no built-in aggregate named median in PostgreSQL (versions from 9.4 onward offer percentile_cont(0.5) instead); however, one has been implemented and contributed to the wiki:

http://wiki.postgresql.org/wiki/Aggregate_Median

It's used the same way as min and max once you've loaded it. Being written in PL/PgSQL it'll be a fair bit slower, but there's even a C version there that you could adapt if speed was vital.
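
To give an idea of what such a user-defined aggregate involves, here is a rough sketch in plain SQL. It is not a copy of the wiki code; the names median_sketch and _median_sketch_final are hypothetical, chosen so they do not clash with a median aggregate you may already have loaded. It collects the values into an array and averages the one or two middle elements.

```sql
-- Final function: sort the collected values and average the middle one
-- (odd count) or the middle two (even count).
CREATE OR REPLACE FUNCTION _median_sketch_final(numeric[]) RETURNS numeric AS $$
    SELECT avg(val)
    FROM (
        SELECT val
        FROM unnest($1) AS val
        ORDER BY 1
        LIMIT  2 - mod(array_upper($1, 1), 2)
        OFFSET ceil(array_upper($1, 1) / 2.0) - 1
    ) AS middle;
$$ LANGUAGE sql IMMUTABLE;

-- Aggregate: append every input value to an array, then hand the
-- array to the final function above.
CREATE AGGREGATE median_sketch(numeric) (
    SFUNC     = array_append,
    STYPE     = numeric[],
    FINALFUNC = _median_sketch_final,
    INITCOND  = '{}'
);
```

Once created, it would be called like any other aggregate, for example SELECT median_sketch(salary) FROM some_table.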

UPDATE After comment:

It sounds like you want to show the statistical aggregates alongside the individual results. You can't do this with a plain aggregate function because you can't reference columns not in the GROUP BY in the result list.

You will need to fetch the stats from subqueries, or use your aggregates as window functions.

Given dummy data:

CREATE TABLE dummystats (
    depname text,
    empno integer,
    salary integer
);

INSERT INTO dummystats(depname,empno,salary) VALUES
('develop',11,5200),
('develop',7,4200),
('personell',2,5555),
('mgmt',1,9999999);

... and after adding the median aggregate from the PG wiki:

You can do this with an ordinary aggregate:

regress=# SELECT min(salary), max(salary), median(salary) FROM dummystats;
 min  |   max   |        median
------+---------+-----------------------
 4200 | 9999999 | 5377.5000000000000000
(1 row)

but not this:

regress=# SELECT depname, empno, min(salary), max(salary), median(salary)
regress-# FROM dummystats;
ERROR:  column "dummystats.depname" must appear in the GROUP BY clause or be used in an aggregate function

because it doesn't make sense in the aggregation model to show aggregates alongside individual values. You can show groups:

regress=# SELECT depname, min(salary), max(salary), median(salary)
regress-# FROM dummystats GROUP BY depname;
  depname  |   min   |   max   |        median
-----------+---------+---------+-----------------------
 personell |    5555 |    5555 | 5555.0000000000000000
 develop   |    4200 |    5200 | 4700.0000000000000000
 mgmt      | 9999999 | 9999999 |  9999999.000000000000
(3 rows)

... but it sounds like you want the individual values. For that, you must use a window, a feature new in PostgreSQL 8.4.

regress=# SELECT depname, empno,
                 min(salary) OVER (),
                 max(salary) OVER (),
                 median(salary) OVER ()
          FROM dummystats;
  depname  | empno | min  |   max   |        median
-----------+-------+------+---------+-----------------------
 develop   |    11 | 4200 | 9999999 | 5377.5000000000000000
 develop   |     7 | 4200 | 9999999 | 5377.5000000000000000
 personell |     2 | 4200 | 9999999 | 5377.5000000000000000
 mgmt      |     1 | 4200 | 9999999 | 5377.5000000000000000
(4 rows)
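
The subquery alternative mentioned earlier could look like the sketch below. It assumes the dummystats table above and PostgreSQL 9.4 or later, and it uses the built-in percentile_cont in place of the wiki's median aggregate. Ordered-set aggregates such as percentile_cont cannot be called with OVER, so a subquery (or join) is the way to combine them with per-row output. The column aliases are made up for illustration.

```sql
-- Compute the statistics once in a one-row subquery, then attach
-- that row to every individual record.
SELECT d.depname,
       d.empno,
       d.salary,
       s.min_salary,
       s.max_salary,
       s.median_salary
FROM dummystats d
CROSS JOIN (
    SELECT min(salary) AS min_salary,
           max(salary) AS max_salary,
           percentile_cont(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
    FROM dummystats
) s;
```

Because the subquery always returns exactly one row, the cross join keeps one output row per row of dummystats.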

Craig Ringer
