As it is not possible to update data within a table in BigQuery, and supports only append mechanism, I have decided to create new tables on monthly basis. So suppose for year 2012 the tables would be (tbl_012012
, tbl_022012
, tbl_032012
,...tbl_122012
). Each record will be stored along with date timestamp
as string.
Now, if my application wants to fetch records ranging from Jan 2012(tbl_012012
) to March 2012(tbl_032012
), will BigQuery API automatically traverse through the desired tables via single SQL range query or will I have to write multiple SQL queries with extra application code to retrieve each query result and then aggregate them altogether?
Just separate each table with a comma in the FROM clause to query across all mentioned tables.
A simple SELECT statement is the most basic way to query multiple tables. You can call more than one table in the FROM clause to combine results from multiple tables. Here's an example of how this works: SELECT table1.
From multiple tables To retrieve information from more than one table, you need to join those tables together. This can be done using JOIN methods, or you can use a second SELECT statement inside your main SELECT query—a subquery.
With BigQuery #standardSQL - you can either use standard UNION ALL to go through multiple tables, or you can use a * to match all tables that share the same prefix. When using the * matcher, you will also have access to the meta-column _TABLE_SUFFIX - to know which table the rows came from.
Currently, cached results are not supported for queries against multiple tables using a wildcard even if the Use Cached Results option is checked. If you run the same wildcard query multiple times, you are billed for each query. Wildcard tables support native BigQuery storage only. You cannot use wildcards when querying an external table or a view.
Just separate each table with a comma in the FROM clause to query across all mentioned tables. Show activity on this post.
With BigQuery #standardSQL - you can either use standard UNION ALL to go through multiple tables, or you can use a * to match all tables that share the same prefix. When using the * matcher, you will also have access to the meta-column _TABLE_SUFFIX - to know which table the rows came from. SELECT * FROM Roster UNION ALL SELECT * FROM TeamMascot
You can also use a Table Wildcard Function. Here's one example from the docs for StandardSQL:
SELECT
name
FROM
mydata.people
WHERE
age >= 35
AND
(_TABLE_SUFFIX BETWEEN '20140325' AND '20140327')
And here's a similar example for LegacySQL (docs).
SELECT
name
FROM
(TABLE_DATE_RANGE([mydata.people],
TIMESTAMP('2014-03-25'),
TIMESTAMP('2014-03-27')))
WHERE
age >= 35
This will query the tables:
There are a few other options on the docs. I'd recommend checking them out.
One SQL query can reference multiple tables. Just separate each table with a comma in the FROM clause to query across all mentioned tables.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With