Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

STRING_AGG in Bigquery

I have a problem with STRING_AGG in Bigquery. I'm trying:

SELECT
 id,
 institution,
 COUNT(DISTINCT institution)  OVER (PARTITION BY id) as count_intitution
 STRING_AGG(DISTINCT institution,"," )  OVER (PARTITION BY id) as list_intitution
FROM
 name_table
WHERE
 DATE(created_at) = "2020-02-02"

and i get this error:

Analytic function string_agg does not support DISTINCT.

BQ documentation says it allows the use of "DISTINCT"

https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#string_agg

But apparently it doesn't support "partition by", why?

EDIT:

the current table is like this (it is an example, the table has more attributes)

|id |institution|
|1  | a         |
|1  | b         |
|2  | a         |
|2  | c         |
|3  | a         |
|1  | a         |

and what I want to achieve is

|id|count_institution|list_institution|
|1 |2                |a,b             |
|2 |2                |a,c             |
|3 |1                |a               |
like image 550
franco pina Avatar asked Apr 30 '26 16:04

franco pina


1 Answers

Below is for BigQuery Standard SQL

#standardSQL
SELECT * 
  REPLACE((
      SELECT STRING_AGG(DISTINCT i) FROM t.list_intitution i
    ) AS list_intitution
  ) 
FROM (
  SELECT
   id,
   institution,
   COUNT(DISTINCT institution)  OVER (PARTITION BY id) AS count_intitution,
   ARRAY_AGG(institution) OVER (PARTITION BY id) AS list_intitution
  FROM
   name_table
  WHERE
   DATE(created_at) = "2020-02-02"
) t  

Note: in your original query you just remove DISTINCT and use ARRAY_AGG instead of STRING_AGG, but then in outer query you process this array to form list of distinct values from that array

Below is answer on your updated question

You can simply use GROUP BY as in below example

#standardSQL
SELECT id, 
  COUNT(DISTINCT institution) AS count_institution,
  STRING_AGG(DISTINCT institution) AS list_institution
FROM name_table
GROUP BY id

If to apply to sample data from your question, as in below example

#standardSQL
WITH name_table AS (
  SELECT 1 id, 'a' institution UNION ALL
  SELECT 1, 'b' UNION ALL
  SELECT 2, 'a' UNION ALL
  SELECT 2, 'c' UNION ALL
  SELECT 3, 'a' UNION ALL
  SELECT 1, 'a' 
)
SELECT id, 
  COUNT(DISTINCT institution) AS count_institution,
  STRING_AGG(DISTINCT institution) AS list_institution
FROM name_table
GROUP BY id

result is

Row id  count_institution   list_institution     
1   1   2                   a,b  
2   2   2                   a,c  
3   3   1                   a    
like image 200
Mikhail Berlyant Avatar answered May 02 '26 05:05

Mikhail Berlyant



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!