
Concatenate/merge array values during grouping/aggregation

Tags: sql, postgresql

I have a table with an array-typed column:

title       tags
"ridealong";"{comedy,other}"
"ridealong";"{comedy,tragedy}"
"freddyjason";"{horror,silliness}"

I would like to write a query that produces a single array per title (in an ideal world it would be a set, i.e. a deduplicated array).

e.g.

select array_cat(tags),title from my_test group by title 

The above query doesn't work of course, but I would like to produce 2 rows:

"ridealong";"{comedy,other,tragedy}"
"freddyjason";"{horror,silliness}"

Any help or pointers would be very much appreciated (I am using Postgres 9.1)


Based on Craig's help I ended up with the following (slightly altered syntax since 9.1 complains about the query exactly as he shows it)

SELECT t1.title, array_agg(DISTINCT tag.tag)
FROM my_test t1,
     (SELECT unnest(tags) AS tag, title FROM my_test) AS tag
WHERE tag.title = t1.title
GROUP BY t1.title;

Yana K. asked Jun 11 '14 01:06



1 Answer

Custom aggregate

Approach 1: define a custom aggregate. Here's one I wrote earlier.

CREATE TABLE my_test(title text, tags text[]);

INSERT INTO my_test(title, tags) VALUES
  ('ridealong', '{comedy,other}'),
  ('ridealong', '{comedy,tragedy}'),
  ('freddyjason', '{horror,silliness}');

CREATE AGGREGATE array_cat_agg(anyarray) (
  SFUNC = array_cat,
  STYPE = anyarray
);

SELECT title, array_cat_agg(tags)
FROM my_test
GROUP BY title;
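Plain concatenation keeps duplicates, so for the sample rows ridealong comes out as something like {comedy,other,comedy,tragedy}. If a deduplicated array is wanted with this approach, one possible sketch is to unnest the merged array again afterwards. It assumes the my_test table and the array_cat_agg aggregate defined above; the aliases merged, all_tags and t are only illustrative.

```sql
-- Concatenate per title first, then deduplicate each merged array.
SELECT title,
       ARRAY(
           SELECT DISTINCT t
           FROM unnest(all_tags) AS t
           ORDER BY t
       ) AS tags
FROM (
    SELECT title, array_cat_agg(tags) AS all_tags
    FROM my_test
    GROUP BY title
) AS merged;
```

On PostgreSQL 14 and later, array_cat is declared over anycompatiblearray, so the aggregate definition will likely need anycompatiblearray in place of anyarray for both the argument type and STYPE.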

LATERAL query

Approach 2: since you don't need to preserve order and want to deduplicate, you could use a LATERAL query like:

SELECT title, array_agg(DISTINCT tag ORDER BY tag)
FROM my_test, unnest(tags) tag
GROUP BY title;

In this case you don't need the custom aggregate. This query is probably a fair bit slower on big data sets because of the deduplication; removing the ORDER BY, if the order isn't required, may help.
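LATERAL was introduced in PostgreSQL 9.3, which is likely why 9.1 rejects the query above. For older versions, the same result can be sketched by calling unnest in a subquery instead. This assumes the my_test table defined earlier; the aliases expanded and tag are only illustrative.

```sql
-- Expand each array into one row per tag, then regroup per title.
SELECT title,
       array_agg(DISTINCT tag ORDER BY tag) AS tags
FROM (
    SELECT title, unnest(tags) AS tag   -- set-returning function in the SELECT list
    FROM my_test
) AS expanded
GROUP BY title
ORDER BY title;
```

With the sample rows this should give {horror,silliness} for freddyjason and {comedy,other,tragedy} for ridealong, without the self-join.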

Craig Ringer answered Oct 10 '22 02:10