Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Paginating BigQuery

I'm trying to create something similar to Google's BigQuery dashboard except with predefined queries/views. The problem I'm running into is paginating the data.

The tabledata endpoint supports pagination in that you can specify a start index or use a page token, allowing me to do something like this:

query_reply = table_data_job.list(projectId=settings.PROJECT_ID,
                                  datasetId=settings.DATASET_ID,
                                  tableId=table,
                                  startIndex=offset,
                                  maxResults=page_size).execute()

The problem with this is that I would like to run specific queries (or, at the very least, order the table data results).

query_data = {'query': 'SELECT * FROM my_dataset.foo_table LIMIT %s' % page_size}
query_reply = job_collection.query(projectId=settings.PROJECT_ID,
                                   body=query_data).execute()

To my knowledge, there's no way to do an offset with the above code. Is this just something BigQuery is not suited for? I guess the alternative would be to do the pagination in memory and work on smaller result sets?

like image 471
Tyler Treat Avatar asked Jun 05 '13 16:06

Tyler Treat


2 Answers

BigQuery query results are tables. So you can run a query and get the destination table from the result and then page through the results using the tabledata.list() api. Alternately you can get the job id from the reply and use jobs.getQueryResults(), which has pagination support.

like image 174
Jordan Tigani Avatar answered Oct 12 '22 23:10

Jordan Tigani


You can achieve pagination with SQL only by using ROW_NUMBER()

Here is generic template

SELECT  t.*
FROM (
SELECT 
1 AS one, 
[field], 
ROW_NUMBER() OVER(PARTITION BY one) AS rownum
FROM [table] ) t WHERE rownum BETWEEN X AND Y
like image 37
Adrian Avatar answered Oct 13 '22 00:10

Adrian