 

BigQuery: is it possible to iterate over an array?

Is it possible to iterate over an array in BigQuery using Standard SQL?

Basically, I want to declare an array of strings representing table fields, e.g.:

DECLARE FIELDS_TO_CHECK ARRAY<STRING>;
SET FIELDS_TO_CHECK =  ['field1', 'field2', 'field3' ];

and then iterate over this array to build queries that return the percentage of non-null values for each field:

select count(FIELD) / count(*) from `table_name`
jledru asked Jul 23 '20 09:07




2 Answers

Below is an example for BigQuery Standard SQL.
I use a TEMP TABLE `table_name` here to mimic your data with some simple dummy data; you can remove that CREATE statement and use your own table.

#standardSQL
DECLARE FIELDS_TO_CHECK ARRAY<STRING>;
DECLARE i INT64 DEFAULT 0;

CREATE TEMP TABLE `table_name` AS 
  SELECT 1 field1, NULL field2, 3 field3, 4 field4, 5 field5 UNION ALL
  SELECT NULL, NULL, 3, NULL, 5 UNION ALL
  SELECT 1, NULL, 3, 4, 6;

CREATE TEMP TABLE result(field STRING, percentage FLOAT64);  
  
SET FIELDS_TO_CHECK =  ['field1', 'field2', 'field3' ];

LOOP
  SET i = i + 1;
  IF i > ARRAY_LENGTH(FIELDS_TO_CHECK) THEN 
    LEAVE; 
  END IF;
  EXECUTE IMMEDIATE '''
    INSERT result
    SELECT "''' || FIELDS_TO_CHECK[ORDINAL(i)] || '''", COUNT(''' || FIELDS_TO_CHECK[ORDINAL(i)] || ''') / COUNT(*) FROM `table_name`
  ''';

END LOOP; 

SELECT * FROM result;   

The example above returns the following output (row order is not guaranteed without an ORDER BY):

Row field   percentage   
1   field2  0.0  
2   field1  0.66666666666666663  
3   field3  1.0  
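
If the loop is only needed to assemble one query per field, a loop-free variant is also possible. The sketch below is not part of the original answer: it assumes the same dummy `table_name`, and the variable name `query` and the alias `f` are illustrative. It builds one SELECT per array element with FORMAT, joins them with UNION ALL via STRING_AGG, and runs the result as a single dynamic statement.

```sql
DECLARE FIELDS_TO_CHECK ARRAY<STRING> DEFAULT ['field1', 'field2', 'field3'];
DECLARE query STRING;  -- hypothetical helper variable holding the generated SQL

-- Same dummy data as in the answer; drop this statement to use a real table.
CREATE TEMP TABLE `table_name` AS
  SELECT 1 field1, NULL field2, 3 field3, 4 field4, 5 field5 UNION ALL
  SELECT NULL, NULL, 3, NULL, 5 UNION ALL
  SELECT 1, NULL, 3, 4, 6;

-- One SELECT per field name, glued together with UNION ALL.
SET query = (
  SELECT STRING_AGG(
    FORMAT('SELECT "%s" AS field, COUNT(%s) / COUNT(*) AS percentage FROM `table_name`', f, f),
    ' UNION ALL ')
  FROM UNNEST(FIELDS_TO_CHECK) AS f
);

-- Run the generated statement once instead of once per field.
EXECUTE IMMEDIATE query;
```

Because the field names are spliced into the SQL text, this is only appropriate when the array comes from a trusted source.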
Mikhail Berlyant answered Oct 11 '22 23:10



You can use FOR ... IN to loop through the array elements.
The key point is to use UNNEST to turn the array into rows that the loop can read one at a time.
This reads a little more declaratively than a LOOP that advances an index.

DECLARE FIELDS_TO_CHECK ARRAY<STRING>;
SET FIELDS_TO_CHECK =  ['field1', 'field2', 'field3' ];

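FOR field IN
  (SELECT name FROM UNNEST(FIELDS_TO_CHECK) AS name)
DO
  # the loop variable is a STRUCT; read the element as field.name
  # do whatever you want with field.name
  SELECT field.name;
END FOR;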
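
To connect this back to the question, the FOR ... IN loop can drive the same dynamic query used in the first answer. This is a minimal sketch, not the answerer's code: it assumes the dummy `table_name` and `result` tables from the first answer, and the names `rec` and `field_name` are illustrative. The loop variable is a STRUCT, so the array element is read through the column alias.

```sql
DECLARE FIELDS_TO_CHECK ARRAY<STRING> DEFAULT ['field1', 'field2', 'field3'];

-- Dummy data borrowed from the first answer; replace with a real table.
CREATE TEMP TABLE `table_name` AS
  SELECT 1 field1, NULL field2, 3 field3, 4 field4, 5 field5 UNION ALL
  SELECT NULL, NULL, 3, NULL, 5 UNION ALL
  SELECT 1, NULL, 3, 4, 6;

CREATE TEMP TABLE result(field STRING, percentage FLOAT64);

FOR rec IN
  (SELECT field_name FROM UNNEST(FIELDS_TO_CHECK) AS field_name)
DO
  -- Column names cannot be bound as query parameters, so the name is
  -- spliced into the statement text with FORMAT.
  EXECUTE IMMEDIATE FORMAT('''
    INSERT result
    SELECT "%s", COUNT(%s) / COUNT(*) FROM `table_name`
  ''', rec.field_name, rec.field_name);
END FOR;

SELECT * FROM result ORDER BY field;
```

Compared with the index-based LOOP, there is no counter to maintain and no ARRAY_LENGTH check.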
鄭元傑 answered Oct 11 '22 22:10
