Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using TABLE_DATE_RANGE with more than 1 year's worth of tables

I'm trying to use google bigquery to select data from tables with a date wildcard. I'd like to be able to use the TABLE_DATE_RANGE function, but I need to query over a large date range (> 1 year). Right now my query works for a year's worth of data, but anything over I get:

Error: TABLE_DATE_RANGE error: too many days

#Fails
SELECT 
  system_id, sample_date, e_pv_array
FROM 
  (TABLE_DATE_RANGE(workspace.intervaldata, 
                    TIMESTAMP('2009-03-01'), 
                    TIMESTAMP('2010-03-04')))
WHERE
  system_id = 20006
  and e_pv_array is not null;

#Works
SELECT 
  system_id, sample_date, e_pv_array
FROM 
  (TABLE_DATE_RANGE(workspace.intervaldata, 
                    TIMESTAMP('2009-03-01'), 
                    TIMESTAMP('2010-03-03')))
WHERE
  system_id = 20006
  and e_pv_array is not null;

Is this just a limitation of bigquery? Or is there a way to use table wildcards with date ranges greater than 1 year?

like image 892
Nathan S. Avatar asked Jun 12 '14 18:06

Nathan S.


People also ask

How can I get data from multiple tables in BigQuery?

Notifications Stay organized with collections Save this page to your Developer Profile to get notifications on important updates. Save and categorize content based on your preferences. Wildcard tables enable you to query multiple tables using concise SQL statements.

How do you use a union in a large query?

Union in BigQuery syntaxSELECT column name, column name... FROM table1 UNION (Distinct / All) SELECT column name, column name….. FROM table2; SELECT column name, column name...

What is the system hierarchy in BigQuery?

BigQuery, like other Google Cloud resources, is organized hierarchically where the Organization node is the root node, the Projects are the children of the Organization, and Datasets are descendants of Projects.


1 Answers

I've experienced problems with TABLE_DATE_RANGE() as well. Here's a possible workaround, assuming your tables are named workspace.intervaldata20090303, etc.:

SELECT 
system_id, sample_date, e_pv_array
FROM 
TABLE_QUERY(workspace, "integer(regexp_extract(table_id, r'intervaldata([0-9]+)')) BETWEEN 20090301 and 20100304")
WHERE
system_id = 20006
and e_pv_array is not null;

I have a similar scenario (date partitioned tables) and am able to run queries over 2 years' worth of data using TABLE_QUERY() without errors.

like image 80
David M Smith Avatar answered Sep 30 '22 12:09

David M Smith