Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using ARRAY_AGG() with all null values in input

I want to select distinct values of columns for each user in my table (in Google BigQuery). I've thought about using ARRAY_AGG() like:

SELECT user_id, ARRAY_AGG(DISTINCT field1) AS f1, ARRAY_AGG(DISTINCT field2) AS f2
FROM t GROUP BY user_id

But because for some user_ids all values in field1 or field2 are null, I get this error message: Array 'f1' cannot have a null element

I wonder if there is a workaround to avoid this error or maybe a different way to achieve the result without using ARRAY_AGG()

like image 680
S.Mohsen sh Avatar asked Oct 04 '16 05:10

S.Mohsen sh


3 Answers

BigQuery finally implemented suggestion from Elliott Brossard and now you can do:

SELECT user_id, ARRAY_AGG(DISTINCT field1 IGNORE NULLS) AS f1, ARRAY_AGG(DISTINCT field2 IGNORE NULLS) AS f2
FROM t
GROUP BY user_id
like image 79
Sebastian Kreft Avatar answered Sep 21 '22 22:09

Sebastian Kreft


From https://cloud.google.com/bigquery/sql-reference/data-types#array-type

BigQuery raises an error if query result has ARRAYs which contain NULL elements, although such ARRAYs can be used inside the query.

your query is ok in a temporary query but not ok in end query result; so a workaround is you may define your query as a tmp table and filter out NULL values before giving end results:

WITH tmp AS (SELECT user_id,
                    ARRAY_AGG(DISTINCT field1) AS f1,
                    ARRAY_AGG(DISTINCT field2) AS f2
FROM t GROUP BY user_id)

SELECT user_id,
    ARRAY(SELECT el FROM UNNEST(f1) AS el WHERE el IS NOT NULL) AS f1,
    ARRAY(SELECT el FROM UNNEST(f2) AS el WHERE el IS NOT NULL) AS f2
 FROM tmp

I've met same problems when porting over some Postgres SQL into BigQuery, a more elegant solution is FILTER clause on aggregation functions,

https://www.postgresql.org/docs/current/static/sql-expressions.html

like ARRAY_AGG(f1 FILTER WHEN f1 IS NOT NULL) which is not available in BigQuery which I really hope they can implement it

like image 28
TomasJ Avatar answered Sep 18 '22 22:09

TomasJ


WITH t1 AS (
  SELECT user_id, ARRAY_AGG(DISTINCT field1) AS f1
  FROM t WHERE field1 IS NOT NULL
  GROUP BY user_id
),
t2 AS (
  SELECT user_id, ARRAY_AGG(DISTINCT field2) AS f2
  FROM t WHERE field2 IS NOT NULL
  GROUP BY user_id
)
SELECT t1.user_id, f1, f2
FROM t1 FULL JOIN t2 
ON t1.user_id = t2.user_id
like image 41
Mikhail Berlyant Avatar answered Sep 19 '22 22:09

Mikhail Berlyant