Concatenate the values of a nested field on BigQuery

Suppose I have the following schema:

[
   {
        'name': 'id',
        'type': 'INTEGER' 
   },
   {
        'name': 'record',
        'type': 'RECORD',
        'fields': [
            {
                'name': 'repeated',
                'type': 'STRING',
                'mode': 'REPEATED'
            }
         ]
   }
]

And the following data:

+--------------------+
|id  |record.repeated|
+--------------------+
|1   |'a'            |
|    |'b'            |
|    |'c'            |
+--------------------+
|2   |'a'            |
|    |'c'            |
+--------------------+
|3   |'d'            |
+--------------------+

What I need is to create a query that returns this:

+--------------------+
|id  |record.repeated|
+--------------------+
|1   |'a,b,c'        |
+--------------------+
|2   |'a,c'          |
+--------------------+
|3   |'d'            |
+--------------------+

In other words, I need a query that concatenates the values of a nested repeated field using a separator (in this case, a comma). Something like MySQL's GROUP_CONCAT function, but on BigQuery.

Related idea: Concat all column values in sql

Is that possible?

Thanks.

Gilberto Torrezan asked Apr 07 '15 07:04



2 Answers

In legacy SQL this is very simple:

SELECT id, GROUP_CONCAT(record.repeated) FROM table GROUP BY id

An example using publicdata:

SELECT group_concat(payload.shas.encoded)
FROM [publicdata:samples.github_nested]
WHERE repository.url='https://github.com/dreamerslab/workspace'
Pentium10 answered Sep 18 '22 00:09


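For standard SQL, unnest the repeated field and aggregate per id:

SELECT id, STRING_AGG(item) AS repeated_csv
FROM your_table, UNNEST(record.repeated) AS item
GROUP BY id

or, to keep rows whose array is empty (they get NULL):

SELECT id, STRING_AGG(item) AS repeated_csv
FROM your_table LEFT JOIN UNNEST(record.repeated) AS item
GROUP BY id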
Asad Rauf answered Sep 19 '22 00:09
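
If the goal is only to join the values already stored in each row's array, one possible alternative in standard SQL is to call ARRAY_TO_STRING on the repeated field directly, which avoids the UNNEST and GROUP BY. This is a minimal sketch, not taken from either answer: the `sample` CTE and the `repeated_csv` alias are made up for illustration and simply reproduce the data from the question.

```sql
-- Hypothetical inline copy of the question's data (BigQuery standard SQL).
WITH sample AS (
  SELECT 1 AS id, STRUCT(['a', 'b', 'c'] AS repeated) AS record
  UNION ALL
  SELECT 2 AS id, STRUCT(['a', 'c'] AS repeated) AS record
  UNION ALL
  SELECT 3 AS id, STRUCT(['d'] AS repeated) AS record
)
SELECT
  id,
  -- Join the array elements of each row with a comma separator.
  ARRAY_TO_STRING(record.repeated, ',') AS repeated_csv
FROM sample
ORDER BY id;
```

For the sample rows this should give 'a,b,c', 'a,c' and 'd'. ARRAY_TO_STRING keeps the elements in array order, whereas STRING_AGG without an ORDER BY clause does not guarantee the order of the concatenated values.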