
How do I get min, median and max from my query in postgresql?

Tags: postgresql

I have written a query in which one column is a month. From that I have to get min month, max month, and median month. Below is my query.

select ext.employee,
       pl.fromdate,
       ext.FULL_INC as full_inc,
       prevExt.FULL_INC as prevInc,
       (extract(year from age (pl.fromdate))*12 +extract(month from age (pl.fromdate))) as month,
       case
         when prevExt.FULL_INC is not null then (ext.FULL_INC -coalesce(prevExt.FULL_INC,0))
         else 0
       end as difference,
       (case when prevExt.FULL_INC is not null then (ext.FULL_INC - prevExt.FULL_INC) / prevExt.FULL_INC*100 else 0 end) as percent
from pl_payroll pl
  inner join pl_extpayfile ext
          on pl.cid = ext.payrollid
         and ext.FULL_INC is not null
  left outer join pl_extpayfile prevExt
               on prevExt.employee = ext.employee
              and prevExt.cid = (select max (cid) from pl_extpayfile
                                 where employee = prevExt.employee
                                 and   payrollid = (
                                   select max(p.cid)
                                   from pl_extpayfile,
                                        pl_payroll p
                                   where p.cid = payrollid
                                   and   pl_extpayfile.employee = prevExt.employee
                                   and   p.fromdate < pl.fromdate
                                 ))
               and coalesce(prevExt.FULL_INC, 0) > 0
where ext.employee = 17
and (exists (
    select employee
    from pl_extpayfile preext
    where preext.employee = ext.employee
    and   preext.FULL_INC <> ext.FULL_INC
    and   payrollid in (
      select cid
      from pl_payroll
      where cid = (
        select max(p.cid)
        from pl_extpayfile,
             pl_payroll p
        where p.cid = payrollid
        and   pl_extpayfile.employee = preext.employee
        and   p.fromdate < pl.fromdate
      )
    )
  )
  or not exists (
    select employee
    from pl_extpayfile fext,
         pl_payroll p
    where fext.employee = ext.employee
    and   p.cid = fext.payrollid
    and   p.fromdate < pl.fromdate
    and   fext.FULL_INC > 0
  )
)
order by employee,
         ext.payrollid desc

If the median is not possible, then is it at least possible to get the max month and min month?

Deepak Kumar asked Aug 22 '12 06:08



2 Answers

To calculate the median in PostgreSQL, simply take the 50th percentile (no need to add extra functions or anything):

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY x) FROM t; 
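
As a rough sketch of how this combines with min and max: ordered-set aggregates such as percentile_cont are available from PostgreSQL 9.4 onward, and they can appear next to ordinary aggregates in the same SELECT. The names t and x are the placeholders used above, and the sample rows are made up purely for illustration.

```sql
-- Hypothetical sample data; t and x are placeholder names, not the asker's schema.
CREATE TEMP TABLE t (x integer);
INSERT INTO t (x) VALUES (3), (12), (7), (25);

-- min and max are ordinary aggregates; the median is the 50th percentile.
-- percentile_cont interpolates, so with these four rows the median is
-- the midpoint of 7 and 12, i.e. 9.5.
SELECT min(x) AS min_x,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY x) AS median_x,
       max(x) AS max_x
FROM t;
```

If the median should always be a value that actually occurs in the column, percentile_disc(0.5) can be swapped in; it picks an existing value instead of interpolating.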
Tobi Oetiker answered Sep 23 '22 00:09


You want the aggregate functions named min and max. See the PostgreSQL documentation and tutorial:

  • http://www.postgresql.org/docs/current/static/tutorial-agg.html
  • http://www.postgresql.org/docs/current/static/functions-aggregate.html

There's no built-in aggregate named median in PostgreSQL; however, one has been implemented and contributed to the wiki:

http://wiki.postgresql.org/wiki/Aggregate_Median

It's used the same way as min and max once you've loaded it. Being written in PL/pgSQL, it'll be a fair bit slower, but there's also a C version there that you could adapt if speed is vital.
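
To give an idea of what loading such an aggregate involves, here is a minimal sketch of a user-defined median written in plain SQL rather than PL/pgSQL. It is an illustration under stated assumptions, not a verbatim copy of the wiki code: the helper name _final_median is hypothetical, and the sketch assumes the aggregated column contains no NULLs.

```sql
-- Final function: receives every collected value as an array and
-- averages the middle one (odd count) or the middle two (even count).
-- Assumption: the array contains no NULL elements.
CREATE OR REPLACE FUNCTION _final_median(numeric[])
RETURNS numeric AS
$$
    SELECT avg(val)
    FROM (
        SELECT val
        FROM unnest($1) AS u(val)
        ORDER BY val
        LIMIT  2 - mod(array_length($1, 1), 2)   -- 1 row if odd, 2 if even
        OFFSET (array_length($1, 1) - 1) / 2     -- integer division
    ) AS middle;
$$
LANGUAGE sql IMMUTABLE;

-- The aggregate appends each input value to an array,
-- then hands the array to the final function.
CREATE AGGREGATE median(numeric) (
    SFUNC     = array_append,
    STYPE     = numeric[],
    FINALFUNC = _final_median,
    INITCOND  = '{}'
);
```

Because every value is collected into an array before sorting, this approach uses memory proportional to the group size, which is one reason it tends to be slower than the built-in aggregates.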

UPDATE After comment:

It sounds like you want to show the statistical aggregates alongside the individual results. You can't do this with a plain aggregate function because you can't reference columns not in the GROUP BY in the result list.

You will need to fetch the stats from subqueries, or use your aggregates as window functions.

Given dummy data:

CREATE TABLE dummystats (
    depname text,
    empno integer,
    salary integer
);

INSERT INTO dummystats(depname,empno,salary) VALUES
('develop',11,5200),
('develop',7,4200),
('personell',2,5555),
('mgmt',1,9999999);

... and after adding the median aggregate from the PG wiki:

You can do this with an ordinary aggregate:

regress=# SELECT min(salary), max(salary), median(salary) FROM dummystats;
 min  |   max   |        median
------+---------+-----------------------
 4200 | 9999999 | 5377.5000000000000000
(1 row)

but not this:

regress=# SELECT depname, empno, min(salary), max(salary), median(salary)
regress-# FROM dummystats;
ERROR:  column "dummystats.depname" must appear in the GROUP BY clause or be used in an aggregate function

because it doesn't make sense in the aggregation model to show the aggregates alongside individual values. You can show groups:

regress=# SELECT depname, min(salary), max(salary), median(salary)
regress-# FROM dummystats GROUP BY depname;
  depname  |   min   |   max   |        median
-----------+---------+---------+-----------------------
 personell |    5555 |    5555 | 5555.0000000000000000
 develop   |    4200 |    5200 | 4700.0000000000000000
 mgmt      | 9999999 | 9999999 |  9999999.000000000000
(3 rows)

... but it sounds like you want the individual values. For that, you must use a window, a feature new in PostgreSQL 8.4.

regress=# SELECT depname, empno,
                 min(salary) OVER (),
                 max(salary) OVER (),
                 median(salary) OVER ()
          FROM dummystats;
  depname  | empno | min  |   max   |        median
-----------+-------+------+---------+-----------------------
 develop   |    11 | 4200 | 9999999 | 5377.5000000000000000
 develop   |     7 | 4200 | 9999999 | 5377.5000000000000000
 personell |     2 | 4200 | 9999999 | 5377.5000000000000000
 mgmt      |     1 | 4200 | 9999999 | 5377.5000000000000000
(4 rows)
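
The subquery route mentioned earlier can be sketched like this. It assumes the dummystats table defined above and, for the median, PostgreSQL 9.4 or later, where percentile_cont is built in. Ordered-set aggregates such as percentile_cont cannot be called with OVER, so a subquery is the form to use if you rely on the built-in rather than a custom median aggregate. The column aliases are made up for illustration.

```sql
-- Compute the statistics once in a one-row subquery,
-- then attach that row to every individual result.
SELECT d.depname,
       d.empno,
       d.salary,
       s.min_salary,
       s.max_salary,
       s.median_salary
FROM dummystats AS d
CROSS JOIN (
    SELECT min(salary) AS min_salary,
           max(salary) AS max_salary,
           percentile_cont(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
    FROM dummystats
) AS s;
```

For a larger query like the one in the question, the same pattern should apply by putting the query in a CTE or derived table and aggregating its month column in place of salary.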

See also:

  • http://www.postgresql.org/docs/current/static/tutorial-window.html
  • http://www.postgresql.org/docs/current/static/functions-window.html
Craig Ringer answered Sep 22 '22 00:09