Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL statistical mode value

I am using the SQL query

    SELECT round(avg(int_value)) AS modal_value FROM t;

to obtain modal value, that, of couse, not is correct, but is a first option to show some result.

So, my question is, "How to do the thing right?".


With PostgreSQL 8.3+ we can use this user-defined agregate to define mode:

CREATE FUNCTION _final_mode(anyarray) RETURNS anyelement AS $f$
    SELECT a FROM unnest($1) a
    GROUP BY 1  ORDER BY COUNT(1) DESC, 1
    LIMIT 1;
$f$ LANGUAGE 'sql' IMMUTABLE;
CREATE AGGREGATE mode(anyelement) (
  SFUNC=array_append,  STYPE=anyarray,
  FINALFUNC=_final_mode, INITCOND='{}'
);

but, as an user-defined average, with big tables it can be slow (compare sum/count with buildin AVG function). With PostgreSQL 9+, there are no direct (buildin) function for calculate statistical mode value? Perhaps using pg_stats... How to do something like

    SELECT (most_common_vals(int_value))[1] AS modal_value FROM t;

The pg_stats VIEW can be used for this kind of task (even once, by hand)?

like image 639
Peter Krauss Avatar asked Jun 12 '26 15:06

Peter Krauss


2 Answers

Since PostgreSQL 9.4 there is a built-in aggregate function mode. It is used like

SELECT mode() WITHIN GROUP (ORDER BY some_value) AS modal_value FROM tbl;

Read more about ordered-set aggregate functions here:

36.10.3. Ordered-Set Aggregates

Built-in Ordered-Set Aggregate Functions

See other answers for dealing with older versions of Postgres.

like image 133
ars Avatar answered Jun 15 '26 05:06

ars


If you want to do it by groups:

select
    int_value * 10 / (select max(int_value) from t) g,
    min(int_value) "from",
    max(int_value) "to",
    count(*) total
from t
group by 1
order by 4 desc
like image 44
Clodoaldo Neto Avatar answered Jun 15 '26 05:06

Clodoaldo Neto



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!