Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Querying multiple tables in Big Query

As it is not possible to update data within a table in BigQuery, and supports only append mechanism, I have decided to create new tables on monthly basis. So suppose for year 2012 the tables would be (tbl_012012, tbl_022012, tbl_032012,...tbl_122012). Each record will be stored along with date timestamp as string.

Now, if my application wants to fetch records ranging from Jan 2012(tbl_012012) to March 2012(tbl_032012), will BigQuery API automatically traverse through the desired tables via single SQL range query or will I have to write multiple SQL queries with extra application code to retrieve each query result and then aggregate them altogether?

like image 461
Pratik Borkar Avatar asked Nov 01 '12 06:11

Pratik Borkar


People also ask

How do I query multiple tables in BigQuery?

Just separate each table with a comma in the FROM clause to query across all mentioned tables.

Can you query from multiple tables?

A simple SELECT statement is the most basic way to query multiple tables. You can call more than one table in the FROM clause to combine results from multiple tables. Here's an example of how this works: SELECT table1.

Can we fetch data from multiple tables using one query?

From multiple tables To retrieve information from more than one table, you need to join those tables together. This can be done using JOIN methods, or you can use a second SELECT statement inside your main SELECT query—a subquery.

How do I match multiple tables in BigQuery?

With BigQuery #standardSQL - you can either use standard UNION ALL to go through multiple tables, or you can use a * to match all tables that share the same prefix. When using the * matcher, you will also have access to the meta-column _TABLE_SUFFIX - to know which table the rows came from.

Can I use a wildcard with multiple tables in BigQuery?

Currently, cached results are not supported for queries against multiple tables using a wildcard even if the Use Cached Results option is checked. If you run the same wildcard query multiple times, you are billed for each query. Wildcard tables support native BigQuery storage only. You cannot use wildcards when querying an external table or a view.

How to query across multiple tables in SQL Server?

Just separate each table with a comma in the FROM clause to query across all mentioned tables. Show activity on this post.

How do I use Union all in BigQuery?

With BigQuery #standardSQL - you can either use standard UNION ALL to go through multiple tables, or you can use a * to match all tables that share the same prefix. When using the * matcher, you will also have access to the meta-column _TABLE_SUFFIX - to know which table the rows came from. SELECT * FROM Roster UNION ALL SELECT * FROM TeamMascot


2 Answers

You can also use a Table Wildcard Function. Here's one example from the docs for StandardSQL:

SELECT 
  name
FROM 
  mydata.people
WHERE 
  age >= 35
  AND
  (_TABLE_SUFFIX BETWEEN '20140325' AND '20140327')

And here's a similar example for LegacySQL (docs).

SELECT 
  name
FROM 
  (TABLE_DATE_RANGE([mydata.people], 
                TIMESTAMP('2014-03-25'), 
                TIMESTAMP('2014-03-27'))) 
WHERE 
  age >= 35

This will query the tables:

  • mydata.people20140325
  • mydata.people20140326
  • mydata.people20140327

There are a few other options on the docs. I'd recommend checking them out.

like image 106
Eduardo Avatar answered Oct 06 '22 00:10

Eduardo


One SQL query can reference multiple tables. Just separate each table with a comma in the FROM clause to query across all mentioned tables.

like image 32
Ryan Boyd Avatar answered Oct 05 '22 23:10

Ryan Boyd