Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there an equivalent of table wildcard functions in BigQuery with standard SQL?

In legacy SQL, users can use table wildcard functions like TABLE_DATE_RANGE, TABLE_QUERY and TABLE_DATE_RANGE_STRICT.

Is there a similar feature with standard SQL?

like image 327
Pavan Edara Avatar asked Jun 06 '16 18:06

Pavan Edara


1 Answers

In legacy SQL, users can reference data from a subset of tables in a dataset using table wildcard functions. In standard SQL, users can achieve the same result using UNION ALL. However, this approach may not be convenient when users want to dynamically determine the set of tables using, for example, either a date range (supported using TABLE_DATE_RANGE and TABLE_DATE_RANGE_STRICT in legacy SQL) or other complex criteria (supported by TABLE_QUERY in legacy SQL). With Standard SQL, BigQuery offers an equivalent to this described below.

The following legacy SQL query that uses the TABLE_QUERY wildcard function can be rewritten using standard SQL.

Legacy SQL query (using TABLE_QUERY):

SELECT SUM(value1)
FROM TABLE_QUERY([myproject:mydataset],"table_id = 'mydailytable_20150105' OR
table_id = 'mydailytable_20150106' OR table_id = 'maydailytable_20150110'")
GROUP BY value2;

Legacy SQL query (using TABLE_DATE_RANGE):

SELECT SUM(value1)
FROM TABLE_DATE_RANGE([myproject:mydataset], TIMESTAMP("2015-01-05"), TIMESTAMP("2015-01-10"))

Standard SQL query:

SELECT SUM(value1)
FROM `myproject.mydataset.mydailytable_*`
WHERE _TABLE_SUFFIX = '20150105'
  OR _TABLE_SUFFIX = '20150106'
  OR _TABLE_SUFFIX = '20150110'
GROUP BY value2;

In the above query, the wildcard table myproject.mydataset.mydailytable_* matches all tables in the dataset myproject.mydataset that have table_id starting with mydailytable_. For example, to match all tables in the dataset the user can use an empty prefix for the wildcard. So, myproject.mydataset.* matches all tables in the dataset.

Since * is a special character, wildcard table names must be quoted when using them in a query.

The _TABLE_SUFFIX pseudo column:

The _TABLE_SUFFIX pseudo column has type STRING and can be used just like any other column. It is a reserved column name, so it needs to be aliased when using it as part of the SELECT list.

Official documentation for this feature is available here:

https://cloud.google.com/bigquery/docs/wildcard-tables https://cloud.google.com/bigquery/docs/querying-wildcard-tables

like image 152
Pavan Edara Avatar answered Oct 19 '22 21:10

Pavan Edara