
Union of arrays as aggregate function

I have the following input:

name  | count | options
-----------------------
user1 | 3     | ['option1', 'option2']
user1 | 12    | ['option2', 'option3']
user2 | 2     | ['option1', 'option3']
user2 | 1     | []

I want the following output:

name  | count | options
-----------------------
user1 | 12    | ['option1', 'option2', 'option3']
user2 | 2     | ['option1', 'option3']

I am grouping by name. For each group, the count should be aggregated as the max and the options should be aggregated as the union. I am having trouble figuring out how to do the latter.

Currently, I have this query:

with data(name, count, options) as (
    select 'user1', 3, array['option1', 'option2']::text[]
    union all
    select 'user1', 12, array['option2', 'option3']::text[]
    union all
    select 'user2', 2, array['option1', 'option3']::text[]
    union all
    select 'user2', 1, array[]::text[]
)
select name, max(count)
from data
group by name

http://rextester.com/YTZ45626

I know this can easily be done by defining a custom aggregate function, but I want to do it in a plain query. I understand the basics of using unnest() on the array (and array_agg() on the results afterwards), but cannot figure out how to work this into my query.

Kate Tres asked May 18 '17 12:05




1 Answer

You can use an implicit lateral join by putting unnest(options) in the FROM list, and then use array_agg(distinct v) to build the array of options:

with data(name, count, options) as (
    select 'user1', 3, array['option1', 'option2']::text[]
    union all
    select 'user1', 12, array['option2', 'option3']::text[]
    union all
    select 'user2', 2, array['option1', 'option3']::text[]
    union all
    select 'user2', 1, array[]::text[]
)
select name, array_agg(distinct v)  -- the 'v' here refers to the 'f(v)' alias below
from data, unnest(options) f(v)
group by name;
┌───────┬───────────────────────────┐
│ name  │         array_agg         │
├───────┼───────────────────────────┤
│ user1 │ {option1,option2,option3} │
│ user2 │ {option1,option3}         │
└───────┴───────────────────────────┘
(2 rows)
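
The query above returns only the merged options. A possible extension that also returns max(count) is sketched below, assuming PostgreSQL 9.4 or later (for the FILTER clause) and using the sample rows from the question's input table. One caveat motivates the change: a plain comma join against unnest() drops every row whose array is empty, so a user with only empty arrays would vanish, and max(count) would be wrong if the largest count sat on a row with an empty array. A LEFT JOIN LATERAL ... ON true keeps those rows, with v set to NULL. The column aliases count and options are illustrative choices.

```sql
with data(name, count, options) as (
    select 'user1', 3, array['option1', 'option2']::text[]
    union all
    select 'user1', 12, array['option2', 'option3']::text[]
    union all
    select 'user2', 2, array['option1', 'option3']::text[]
    union all
    select 'user2', 1, array[]::text[]
)
select name,
       -- rows with empty arrays are kept by the left join, so they still count here
       max(count) as count,
       -- drop the NULL placeholder produced for empty arrays;
       -- fall back to an empty array if a group has no options at all
       coalesce(
           array_agg(distinct v order by v) filter (where v is not null),
           '{}'::text[]
       ) as options
from data
left join lateral unnest(options) as f(v) on true
group by name
order by name;
```

With the sample rows this should give user1 with 12 and {option1,option2,option3}, and user2 with 2 and {option1,option3}. Note that the filter also discards genuine NULL elements stored inside the arrays; if those need to be preserved, a different marker for "empty array" would be required.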
Marth answered Sep 19 '22 17:09