I'm using PostgreSQL 9.1 and I have this data structure:
A B
-------
1 a
1 a
1 b
1 c
1 c
1 c
1 d
2 e
2 e
I need a query that produces this result:
1 4 {{c,3},{a,2},{b,1},{d,1}}
2 1 {{e,2}}
A=1, 4 rows total with A=1, the partial counts (3 rows with c value, 2 rows with a value, .....)
The sort needed for the array is based of the count of each group (like the example 3,2,1,1).
This should do the trick:
SELECT a
, sum(ab_ct)::int AS ct_total
, count(*)::int AS ct_distinct_b
, array_agg(b || ', ' || ab_ct::text) AS b_arr
FROM (
SELECT a, b, count(*) AS ab_ct
FROM tbl
GROUP BY a, b
ORDER BY a, ab_ct DESC, b -- append "b" to break ties in the count
) t
GROUP BY a
ORDER BY ct_total DESC;
Returns:
ct_total
: total count of b
per a
.ct_distinct_b
: count of distinct b
per a
.b_arr
: array of b
plus frequency of b
, sorted by frequency of b
.Ordered by total count of b
per a
.
Alternatively, you can use an ORDER BY
clause within the aggregate call in PostgreSQL 9.0 or later. Like:
SELECT a
, sum(ab_ct)::int AS ct_total
, count(*)::int AS ct_distinct_b
, array_agg(b || ', ' || ab_ct::text ORDER BY a, ab_ct DESC, b) AS b_arr
FROM (
SELECT a, b, count(*) AS ab_ct
FROM tbl
GROUP BY a, b
) t
GROUP BY a
ORDER BY ct_total DESC;
May be clearer. But it's typically slower. And sorting rows in a subquery works for simple queries like this one. More explanation:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With