I have Firestore database with an installed Google BigQuery extension, so all our data ends up in Google's BigQuery.
the structure of our 'raw_changelog' table looks like this.

my queation is related to the data column (which is of type JSON) and the fact I cannot seem to build a Materialized View (MV) on top it.
I'd like to create the following MV;
CREATE OR REPLACE MATERIALIZED VIEW `XXX.XXX.v_data`
PARTITION BY DATE(`published_at`)
CLUSTER BY `event_id`
OPTIONS (
enable_refresh=true,
refresh_interval_minutes=1
)
AS
SELECT
`published_at`,
`event_id`,
ARRAY_AGG(`tags_unnest`.`name` IGNORE NULLS) AS `tags`
FROM `XXX.XXX.XXX_raw_latest` AS `o`
LEFT JOIN UNNEST(JSON_QUERY_ARRAY(`data`.`tags`)) AS `tags_unnest`
WHERE
DATE(`o`.`published_at`) > '2021-01-01'
GROUP BY
`published_at`,
`event_id`
However I'm getting the following error;

I found at that the error is related to line;
ARRAY_AGG(`tags_unnest`.`name` IGNORE NULLS) AS `tags`
I verified that tags_unnest is of type object (coming from the JSON data field) and it seems the issue comess from the fact tags_unnest is an object I guess.
if I replace the above line with;
ARRAY_AGG('aa' IGNORE NULLS) AS `tags`
OR
ARRAY_AGG(`published_at` IGNORE NULLS) AS `tags`
then it works fine and I can go ahead create the MV.
Perhaps also good to note; in the source table, if I change the type of the data field to 'STRING' then my original create-MV-query also runs without any issues..
Can anyone tell me why I cannot create the MV and possibly also a way around my problem?
There is a workaround for the Materialized View: Please replace the LEFT JOIN by a ,
For a Materialized View (MV) there is a limit amount of calculations allowed.
The reference says that Left/right/full outer joins are not allowed in MV and e.g. only one GROUP BY clause operation is allowed.
Your query contains the following 4 task operations:
unnest-ing an json array, containing an objectleft join.nameor .x) andarray_agg.If you use one less operation, it should work. Replace the left join by a ,
Lets build a dummy table:
CREATE OR REPLACE TABLE
Test.ajson AS
SELECT
1 published_at, 1 event_id, json ' {"a":[1],"tags":[{"x":1}]} ' AS DATA, [1,2] arr
UNION ALL SELECT 2, 2, json ' {"a":[2],"tags":[{"x":2},{"x":20}]} ', []
UNION ALL SELECT 1, 1, json ' {"tags":[{"x":10}]} ', []
UNION ALL SELECT 5,5,json "{}", []
Now build the MV:
CREATE OR REPLACE MATERIALIZED VIEW `Test.ajsonview` as
SELECT
published_at,
event_id,
ARRAY_AGG(`tags_unnest`.`x` IGNORE NULLS) AS `tags`,
# ARRAY_AGG(`tags_unnest` IGNORE NULLS) AS testing
FROM Test.ajson
, # works
#left join # throws error: Incremental materialized view query contains unsupported feature.
UNNEST(JSON_QUERY_ARRAY(`data`.`tags`)) tags_unnest
WHERE published_at>0
group by 1,2
Note that the ARRAY_AGG without the .x:
ARRAY_AGG(`tags_unnest` IGNORE NULLS) AS testing
is working in combination with LEFT JOIN.
The Tags column can be empty, therefore the functionality of LEFT JOIN is needed and , does not work directly. There is dirty workaround for that: Duplicate the table with unnest([1,2]) as temp01. In case for temp01=1 we unnest the tags. For temp01=2 a dummy json is added to the dataset [json '{"x":-9877}']. In the array_agg only the temp01=1 tags are keept.
CREATE OR REPLACE MATERIALIZED VIEW `Test.ajsonview` as
SELECT
published_at,
event_id,
ARRAY_AGG(if(temp01=1,`tags_unnest`.`x`,null) IGNORE NULLS) AS `tags`,
FROM Test.ajson,
unnest([1,2]) as temp01,
UNNEST(if(1=temp01,JSON_QUERY_ARRAY(`data`.`tags`),[json '{"x":-9877}'])) tags_unnest
WHERE published_at>0
group by 1,2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With