
How to use array_agg() for varchar[]

I have a column in our database called min_crew that holds character varying arrays (varchar[]) such as '{CA, FO, FA}'.

I have a query where I'm trying to get aggregates of these arrays without success:

SELECT use.user_sched_id, array_agg(se.sched_entry_id) AS seids
     , array_agg(se.min_crew) 
FROM base.sched_entry se
   LEFT JOIN base.user_sched_entry use ON se.sched_entry_id = use.sched_entry_id
WHERE se.sched_entry_id = ANY(ARRAY[623, 625])
GROUP BY user_sched_id;

Both 623 and 625 have the same use.user_sched_id, so the result should be a single row grouping the seids and the min_crew arrays, but I just keep getting this error:

ERROR:  could not find array type for data type character varying[]

If I remove the array_agg(se.min_crew) portion of the code, I do get a table returned with the user_sched_id = 2131 and seids = '{623, 625}'.

asked Nov 22 '13 at 23:11 by Darin Peterson




1 Answer

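Before Postgres 9.5, the standard aggregate function array_agg() only accepts non-array types as input, not array types, which is why you get this error. (Postgres 9.5 added a variant of array_agg() that takes arrays and returns an array with one more dimension.)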
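
For illustration, the built-in variant can be tried on inline sample data on Postgres 9.5 or later. The VALUES list and the alias t(min_crew) are made up for this sketch and are not from your schema; note that both sample arrays have the same length:

```sql
-- Requires Postgres 9.5+; the sample rows are hypothetical.
SELECT array_agg(min_crew) AS min_crew_arr
FROM  (
   VALUES ('{CA,FO,FA}'::varchar[])
        , ('{CA,FO,FO}'::varchar[])
   ) t(min_crew);
```

The result is a single two-dimensional varchar[] holding one sub-array per input row.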

You could use the custom aggregate function array_agg_mult() as defined in this related answer:
Selecting data into a Postgres array

Create it once per database. Then your query could work like this:

SELECT use.user_sched_id, array_agg(se.sched_entry_id) AS seids
      ,array_agg_mult(ARRAY[se.min_crew]) AS min_crew_arr
FROM   base.sched_entry se
LEFT   JOIN base.user_sched_entry use USING (sched_entry_id)
WHERE  se.sched_entry_id = ANY(ARRAY[623, 625])
GROUP  BY user_sched_id;

There is a detailed rationale in the linked answer.
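
For reference, a minimal sketch of such an aggregate, built on the built-in function array_cat(). Treat it as an approximation of the definition in the linked answer and check it against the original. One assumption to be aware of: on Postgres 14 or later, array_cat() is declared with anycompatiblearray, so the pseudo-type would probably have to be changed in both places.

```sql
-- Custom aggregate that concatenates its input arrays.
-- Wrapping each input as ARRAY[arr] in the call is what adds the extra dimension.
-- Intended for Postgres versions before 9.5.
CREATE AGGREGATE array_agg_mult (anyarray) (
   SFUNC    = array_cat   -- built-in function: concatenates two arrays
 , STYPE    = anyarray    -- the running state is an array of the same type
 , INITCOND = '{}'        -- start from an empty array
);
```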

Extents have to match

In response to your comment, consider this quote from the manual on array types:

Multidimensional arrays must have matching extents for each dimension. A mismatch causes an error.

There is no way around that: the array type in Postgres does not allow such a mismatch. You could pad your arrays with NULL values so that all of them have the same length and the extents match.
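
A rough sketch of the padding idea, assuming Postgres 9.5+ and hypothetical inline sample data (the CTE t and the column alias max_len are invented for this example). Each array is extended with NULL elements up to the length of the longest one before aggregating:

```sql
-- Requires Postgres 9.5+ (array_agg() over arrays); sample rows are hypothetical.
WITH t(min_crew) AS (
   VALUES ('{CA,FO,FA}'::varchar[])
        , ('{CA,FO}'::varchar[])
   )
SELECT array_agg(
          min_crew
          -- append as many NULLs as this array is shorter than the longest one
          || array_fill(NULL::varchar, ARRAY[max_len - cardinality(min_crew)])
       ) AS padded
FROM  (
   SELECT min_crew
        , max(cardinality(min_crew)) OVER () AS max_len  -- longest array in the set
   FROM   t
   ) sub;
```

Without the padding, the same aggregation over arrays of different lengths raises an error.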

But I would rather convert the arrays to comma-separated lists with array_to_string() for the purpose of this query and use string_agg() to aggregate the text, preferably with a different separator. Using a newline in my example:

SELECT use.user_sched_id, array_agg(se.sched_entry_id) AS seids
      ,string_agg(array_to_string(se.min_crew, ','), E'\n') AS min_crews
FROM   ...
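
The same approach as a self-contained sketch on hypothetical inline data (the CTE t and its two rows are assumptions, only the ids are borrowed from the question). Arrays of different lengths are no problem here, since the result is plain text:

```sql
-- Sample rows are hypothetical; works with arrays of different lengths.
WITH t(sched_entry_id, min_crew) AS (
   VALUES (623, '{CA,FO,FA}'::varchar[])
        , (625, '{CA,FO}'::varchar[])
   )
SELECT string_agg(array_to_string(min_crew, ','), E'\n'
                  ORDER BY sched_entry_id) AS min_crews  -- one line per entry
FROM   t;
```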

Normalize

You might want to consider normalizing your schema to begin with. Typically, you would implement such an n:m relationship with a separate table, as outlined in this example:
How to implement a many-to-many relationship in PostgreSQL?
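
A minimal sketch of what that could look like here. The tables crew_role and sched_entry_min_crew and all of their columns are hypothetical names, and the foreign key assumes that sched_entry_id is the primary key of base.sched_entry:

```sql
-- Hypothetical lookup table for crew roles such as 'CA', 'FO', 'FA'.
CREATE TABLE base.crew_role (
   crew_role_id serial PRIMARY KEY
 , code         varchar NOT NULL UNIQUE
);

-- Hypothetical link table: one row per (schedule entry, required role).
CREATE TABLE base.sched_entry_min_crew (
   sched_entry_id int REFERENCES base.sched_entry (sched_entry_id)
 , crew_role_id   int REFERENCES base.crew_role (crew_role_id)
 , PRIMARY KEY (sched_entry_id, crew_role_id)
);

-- Roles per schedule entry, aggregated from plain text values,
-- which array_agg() supports in any Postgres version that has it.
SELECT m.sched_entry_id
     , array_agg(r.code ORDER BY r.code) AS min_crew
FROM   base.sched_entry_min_crew m
JOIN   base.crew_role r USING (crew_role_id)
GROUP  BY m.sched_entry_id;
```

If a role can be required more than once per entry, the link table would need an additional count column instead of relying on the composite primary key alone.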

answered Oct 07 '22 at 23:10 by Erwin Brandstetter