array of distinct values aggregated from an array column in Postgres

Suppose we have (in PostgreSQL 9.1) a table with an identifier, a column of type integer[], and one or more other columns of type integer (or any other type that can be summed).

The goal is an aggregate that gives, for each identifier, the sum of the "summable" column and an array of all distinct elements of the array column.

The only way I can find is to use the unnest function on the array column in one subquery and then join it with another subquery that aggregates the "summable" columns.

A simple example is as follows:

CREATE TEMP TABLE a (id integer, aint integer[], summable_val integer);
INSERT INTO a VALUES
(1, array[1,2,3], 5),
(2, array[2,3,4], 6),
(3, array[3,4,5], 2),
(1, array[7,8,9], 19);

WITH u AS (
    SELECT id, unnest(aint) AS t
    FROM a
    GROUP BY 1, 2
),
d AS (
    SELECT id, array_agg(DISTINCT t) AS ar
    FROM u
    GROUP BY 1
),
v AS (
    SELECT id, sum(summable_val) AS val
    FROM a
    GROUP BY 1
)
SELECT v.id, v.val, d.ar
FROM v
JOIN d ON v.id = d.id;

The code above does what I intended, but can we do any better? The main drawback of this solution is that it reads and aggregates the table twice, which might be troublesome for larger tables.

Another solution to the general problem is to avoid the array column altogether: store one row per array member, aggregate the "summable" column per member, and then use array_agg in the aggregation. For now, though, I'd like to stick with the array approach.

Thanks in advance for any ideas.


asked Feb 18 '13 11:02

One Data Guy

1 Answer

The query below may be slightly faster (I suppose), but I cannot see any significant optimization:

select a.id, sum(summable_val) val, ar
from
    (select id, array_agg(distinct t) ar 
        from 
        (select id, unnest(aint) as t from a group by 1,2) u
    group by 1) x
    join a on x.id = a.id
group by 1,3
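
For comparison, here is a minimal sketch of a variant that scans the table only once. It assumes a user-defined aggregate, here given the hypothetical name array_cat_agg, that concatenates all arrays of a group; the duplicates are then removed per group in the outer query. This is not part of the original answer and has not been benchmarked, so whether it actually beats the join on a large table is an open question. The anyarray signature matches PostgreSQL 9.1; on recent versions (14 and later, as far as I know) array_cat is declared with anycompatiblearray and the aggregate definition would need to be adjusted accordingly.

```sql
-- Hypothetical helper aggregate: concatenates all arrays within a group.
-- array_cat is not strict, so the initial NULL state is handled without INITCOND.
CREATE AGGREGATE array_cat_agg(anyarray) (
    SFUNC = array_cat,
    STYPE = anyarray
);

-- Single scan of table a: sum the values and collect every array element,
-- then deduplicate the collected elements per group in the outer query.
SELECT s.id,
       s.val,
       ARRAY(SELECT DISTINCT t FROM unnest(s.all_vals) AS t ORDER BY t) AS ar
FROM (
    SELECT id,
           sum(summable_val)  AS val,
           array_cat_agg(aint) AS all_vals
    FROM a
    GROUP BY id
) s;
```

With the sample data from the question, id 1 should come out with val 24 and ar {1,2,3,7,8,9}. The trade-off is that the concatenated array for each group is held in memory before deduplication, which may matter if individual groups are very large.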


answered Oct 08 '22 21:10

klin


Tags: arrays, sql, postgresql, aggregate-functions
