
Splitting a string column in BigQuery


Let's say I have a table in BigQuery containing 2 columns. The first column represents a name, and the second is a delimited list of values, of arbitrary length. Example:

    Name | Scores
    -----+------------
    Bob  | 10;20;20
    Sue  | 14;12;19;90
    Joe  | 30;15

I want to transform this into two columns, where the first is the name and the second is a single score value, like so:

    Name,Score
    Bob,10
    Bob,20
    Bob,20
    Sue,14
    Sue,12
    Sue,19
    Sue,90
    Joe,30
    Joe,15

Can this be done in BigQuery alone?

David M Smith asked Oct 16 '13 21:10




2 Answers

Good news everyone! BigQuery can now SPLIT()!


Look at "find all two word phrases that appear in more than one row in a dataset".

Before SPLIT() was available, there was no way in BigQuery to generate multiple rows from a single string value. The workaround was to use a regular expression to locate the delimiters and extract the first value, then run a similar query to extract the second value, and so on. These queries can all be merged into a single query, using the pattern presented in the above example (UNION through commas).
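As a rough sketch of that workaround, the query below extracts one position per subquery and merges the results. It is written in Standard SQL, so it uses UNION ALL where legacy SQL would list the subqueries separated by commas. The sample rows come from the question; the CTE name t and the limit of four positions are assumptions for illustration, so longer lists would need more subqueries.

```sql
-- Sketch of the pre-SPLIT() workaround: one REGEXP_EXTRACT per position.
-- The CTE name t and the four-position limit are illustrative assumptions.
WITH t AS (
  SELECT 'Bob' AS Name, '10;20;20' AS Scores UNION ALL
  SELECT 'Sue', '14;12;19;90' UNION ALL
  SELECT 'Joe', '30;15'
)
SELECT Name, Score
FROM (
  -- 1st value: everything before the first ';'
  SELECT Name, REGEXP_EXTRACT(Scores, r'^([^;]*)') AS Score FROM t
  UNION ALL
  -- 2nd value: skip one "value;" prefix, then capture
  SELECT Name, REGEXP_EXTRACT(Scores, r'^(?:[^;]*;){1}([^;]*)') FROM t
  UNION ALL
  -- 3rd value
  SELECT Name, REGEXP_EXTRACT(Scores, r'^(?:[^;]*;){2}([^;]*)') FROM t
  UNION ALL
  -- 4th value
  SELECT Name, REGEXP_EXTRACT(Scores, r'^(?:[^;]*;){3}([^;]*)') FROM t
)
-- REGEXP_EXTRACT returns NULL when a row has no value at that position
WHERE Score IS NOT NULL
ORDER BY Name;
```

Each subquery scans the table once, which is the main reason SPLIT() is preferable now that it exists.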

Felipe Hoffa answered Oct 04 '22 19:10



Trying to rewrite Elad Ben Akoune's answer in Standard SQL, the query becomes:

    WITH name_score AS (
      SELECT Name, SPLIT(Scores, ';') AS Score
      FROM (
        (SELECT * FROM (SELECT 'Bob' AS Name, '10;20;20' AS Scores))
        UNION ALL
        (SELECT * FROM (SELECT 'Sue' AS Name, '14;12;19;90' AS Scores))
        UNION ALL
        (SELECT * FROM (SELECT 'Joe' AS Name, '30;15' AS Scores))
      )
    )
    SELECT name, score
    FROM name_score
    CROSS JOIN UNNEST(name_score.score) AS score;

And this outputs:

    +------+-------+
    | name | score |
    +------+-------+
    | Bob  | 10    |
    | Bob  | 20    |
    | Bob  | 20    |
    | Sue  | 14    |
    | Sue  | 12    |
    | Sue  | 19    |
    | Sue  | 90    |
    | Joe  | 30    |
    | Joe  | 15    |
    +------+-------+
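Note that SPLIT() returns an array of strings, so the scores above are still text. If numeric scores in their original order are needed, a variant along these lines might help. It is only a sketch: the CTE name t and the aliases s and pos are illustrative, and SAFE_CAST is used on the assumption that malformed pieces should become NULL rather than fail the query.

```sql
-- Sketch: split, flatten, cast to INT64, and keep each score's position.
-- The names t, s and pos are illustrative; sample rows are from the question.
WITH t AS (
  SELECT 'Bob' AS Name, '10;20;20' AS Scores UNION ALL
  SELECT 'Sue', '14;12;19;90' UNION ALL
  SELECT 'Joe', '30;15'
)
SELECT
  Name,
  SAFE_CAST(s AS INT64) AS Score,  -- NULL instead of an error for non-numeric pieces
  pos                              -- zero-based position within the original list
FROM t
CROSS JOIN UNNEST(SPLIT(Scores, ';')) AS s WITH OFFSET AS pos
ORDER BY Name, pos;
```

WITH OFFSET exposes each element's index, so the ORDER BY can restore the order of the original delimited list, which UNNEST alone does not guarantee.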
az3 answered Oct 04 '22 19:10
