I saw this solution as an alternative to materialized views:
But it's using the scheduled queries that run at most every 3 hours. My users are expecting live data, what can I do?
2018-10: BigQuery doesn't support materialized views, but you can use this approach:
Code would look like this:
CREATE OR REPLACE VIEW `wikipedia_vt.just_latest_rows_live` AS
SELECT latest_row.*
FROM (
SELECT ARRAY_AGG(a ORDER BY datehour DESC LIMIT 1)[OFFSET(0)] latest_row
FROM (
SELECT * FROM `fh-bigquery.wikipedia_vt.just_latest_rows`
# previously "materialized" results
UNION ALL
SELECT * FROM `fh-bigquery.wikipedia_v3.pageviews_2018`
# append-only table, source of truth
WHERE datehour > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 DAY )
) a
GROUP BY title
)
Note that BigQuery is able to use TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 DAY )
to prune partitions effectively.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With