I'm trying to create something similar to Google's BigQuery dashboard except with predefined queries/views. The problem I'm running into is paginating the data.
The tabledata endpoint supports pagination in that you can specify a start index or use a page token, allowing me to do something like this:
query_reply = table_data_job.list(projectId=settings.PROJECT_ID,
datasetId=settings.DATASET_ID,
tableId=table,
startIndex=offset,
maxResults=page_size).execute()
The problem with this is that I would like to run specific queries (or, at the very least, order the table data results).
query_data = {'query': 'SELECT * FROM my_dataset.foo_table LIMIT %s' % page_size}
query_reply = job_collection.query(projectId=settings.PROJECT_ID,
body=query_data).execute()
To my knowledge, there's no way to do an offset with the above code. Is this just something BigQuery is not suited for? I guess the alternative would be to do the pagination in memory and work on smaller result sets?
BigQuery query results are tables. So you can run a query and get the destination table from the result and then page through the results using the tabledata.list() api. Alternately you can get the job id from the reply and use jobs.getQueryResults(), which has pagination support.
You can achieve pagination with SQL only by using ROW_NUMBER()
Here is generic template
SELECT t.*
FROM (
SELECT
1 AS one,
[field],
ROW_NUMBER() OVER(PARTITION BY one) AS rownum
FROM [table] ) t WHERE rownum BETWEEN X AND Y
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With