How to get intersection of two arrays in BigQuery

I have data like:

id  col1     col2
-----------------
1   [1,2]    [2,3]
2   [4,4,6]  [6,7]

and I want to have data like:

id  col3
---------
1   [2]
2   [6]

Any smart solutions for this?

Keisuke Nagakawa 永川 圭介 asked Aug 08 '18 11:08



2 Answers

You can use INTERSECT DISTINCT inside an ARRAY subquery:

-- build example table
WITH example AS (
  SELECT *
  FROM UNNEST([
    STRUCT(1 AS id, [1,2] AS col1, [2,3] AS col2),
    STRUCT(2, [4,4,6], [6,7])
  ])
)

-- INTERSECT per row on two arrays
SELECT
  id,
  ARRAY(
    SELECT * FROM example.col1
    INTERSECT DISTINCT
    (SELECT * FROM example.col2)
  ) AS col3
FROM example

Martin Weitzmann answered Oct 24 '22 00:10


Sorry, I solved it myself with a JavaScript UDF:

#standardSQL
CREATE TEMPORARY FUNCTION intersection(x ARRAY<INT64>, y ARRAY<INT64>)
RETURNS ARRAY<INT64>
LANGUAGE js AS """
  // Keep each element of x that also occurs in y.
  return x.filter(value => y.indexOf(value) !== -1);
""";
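
One thing to watch: a filter-based body keeps duplicates from the first array, while INTERSECT DISTINCT does not. Here is a minimal sketch in plain JavaScript, run outside BigQuery, that shows the difference and one possible way to deduplicate with a Set. The function names intersectKeepDuplicates and intersectDistinct are made up for illustration.

```javascript
// Hypothetical helper names; plain JavaScript, not BigQuery-specific.

// Same logic as the UDF body: keeps every element of x that occurs in y,
// so repeated values in x survive in the result.
function intersectKeepDuplicates(x, y) {
  return x.filter(value => y.indexOf(value) !== -1);
}

// Deduplicated variant: one Set gives fast membership tests on y,
// and a second Set drops repeated values from the result.
function intersectDistinct(x, y) {
  const inY = new Set(y);
  return [...new Set(x.filter(value => inY.has(value)))];
}

console.log(intersectKeepDuplicates([4, 4, 6], [4, 6, 7])); // [ 4, 4, 6 ]
console.log(intersectDistinct([4, 4, 6], [4, 6, 7]));       // [ 4, 6 ]
```

For the sample data in the question both variants give the same result, since the duplicated 4 is not in col2.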

Any smarter ideas are welcome. Thanks!

Keisuke Nagakawa 永川 圭介 answered Oct 23 '22 23:10