Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generic comparison method for two tables in BigQuery if tables contain STRUCT type

I'm looking for a generic method to compare two tables in BigQuery, even if they have columns that are STRUCT type.

It should work for any pair of tables, and ideally wouldn't involve writing a query that depends on that actual columns of the tables. All I really need to know is if the tables are equal or not, but it would be a bonus if it could show me the difference between the rows that aren't the same.

So something like (in pseudo code)

sizeOf( TABLE A EXCEPT TABLE B ) == 0

or

Hash(TABLE A) == HASH(TABLE B)

Would be fine.

I tried using this:

(   SELECT * FROM table1
    EXCEPT DISTINCT
    SELECT * FROM table2)  
UNION ALL
(   SELECT * FROM table2
    EXCEPT DISTINCT
    SELECT * FROM table1) 

But I got this error.

Column 1 in EXCEPT ALL has type that does not support set operation comparisons: STRUCT at [3:5]

Does anyone know of a way to get around this?

EDIT

Should have mentioned before, but I need this to work regardless of the ordering of the rows of the table.

like image 490
Ryan Stull Avatar asked Oct 16 '25 02:10

Ryan Stull


1 Answers

I think yo are looking for something like below to start with

#standardSQL
SELECT TO_JSON_STRING(a) FROM `project.dataset.tableA` a
EXCEPT DISTINCT
SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b

Or, more complete example - to show differences - note: this can be quite exhausting output for really different tables

#standardSQL
SELECT 'a' table, * FROM (
  SELECT TO_JSON_STRING(a) record FROM `project.dataset.tableA` a
  EXCEPT DISTINCT
  SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b
)
UNION ALL
SELECT 'b', * FROM (
  SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b
  EXCEPT DISTINCT
  SELECT TO_JSON_STRING(a) FROM `project.dataset.tableA` a
)
like image 110
Mikhail Berlyant Avatar answered Oct 18 '25 18:10

Mikhail Berlyant