I'm using PostgreSQL 9.1 and I have this data structure:
A     B
-------
1     a
1     a
1     b
1     c
1     c
1     c
1     d
2     e
2     e
I need a query that produces this result:
1    4     {{c,3},{a,2},{b,1},{d,1}}
2    1     {{e,2}}
A=1, 4 rows total with A=1, the partial counts (3 rows with c value, 2 rows with a value, .....)
The sort needed for the array is based of the count of each group (like the example 3,2,1,1).
This should do the trick:
SELECT a
     , sum(ab_ct)::int AS ct_total
     , count(*)::int   AS ct_distinct_b
     , array_agg(b || ', ' || ab_ct::text) AS b_arr
FROM  (
    SELECT a, b, count(*) AS ab_ct
    FROM   tbl
    GROUP  BY a, b
    ORDER  BY a, ab_ct DESC, b  -- append "b" to break ties in the count
    ) t
GROUP  BY a
ORDER  BY ct_total DESC;
Returns:
ct_total: total count of b per a.ct_distinct_b: count of distinct b per a.b_arr: array of b plus frequency of b, sorted by frequency of b.Ordered by total count of b per a.
Alternatively, you can use an ORDER BY clause within the aggregate call in PostgreSQL 9.0 or later. Like: 
SELECT a
     , sum(ab_ct)::int AS ct_total
     , count(*)::int   AS ct_distinct_b
     , array_agg(b || ', ' || ab_ct::text ORDER BY a, ab_ct DESC, b) AS b_arr
FROM  (
    SELECT a, b, count(*) AS ab_ct
    FROM   tbl
    GROUP  BY a, b
    ) t
GROUP  BY a
ORDER  BY ct_total DESC;
May be clearer. But it's typically slower. And sorting rows in a subquery works for simple queries like this one. More explanation:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With