
Exporting query results as JSON via Google BigQuery API

I've got jobs/queries that return a few hundred thousand rows. I'd like to take the results of each query and write them as JSON to a storage bucket.

Is there any straightforward way of doing this? Right now the only method I can think of is:

  • set allowLargeResults to true
  • set a randomly named destination table to hold the query output
  • create a 2nd job to extract the data in the "temporary" destination table to a file in a storage bucket
  • delete the random "temporary" table.

This just seems a bit messy and roundabout. I'm going to be wrapping all this in a service hooked up to a UI that will have lots of users hitting it, and I would rather not be in the business of managing all these temporary tables.

NoCarrier asked Oct 26 '15 23:10



1 Answer

1) As you mention, your steps are sound. The export job has to write to Google Cloud Storage (GCS). Exporting data from BigQuery is explained in the BigQuery documentation; also check the variants of the destination path syntax.

You can then download the files from GCS to local storage, for example with the gsutil tool.

With this approach you first export to GCS, then transfer the files to the local machine. If you have a message queue system (like Beanstalkd) in place to drive all this, it's easy to chain the operations: submit the job, monitor its state, start the export to GCS once it is done, then delete the temp table.
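The chain above boils down to two request bodies submitted to the BigQuery REST API `jobs.insert` method: one query job that writes to a destination table, and one extract job that writes that table to GCS. What follows is a minimal sketch, not a complete client. The project, dataset, and bucket names, the `tmp_` naming scheme, and the helper functions are all hypothetical; only the field names come from the API's job configuration.

```python
import json
import uuid


def temp_table_ref(project_id, dataset_id):
    """Reference to a randomly named destination table (hypothetical naming scheme)."""
    return {
        "projectId": project_id,
        "datasetId": dataset_id,
        "tableId": "tmp_" + uuid.uuid4().hex,
    }


def query_job_body(sql, destination):
    """jobs.insert body: run the query and store its output in `destination`."""
    return {
        "configuration": {
            "query": {
                "query": sql,
                "allowLargeResults": True,  # requires a destination table
                "destinationTable": destination,
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


def extract_job_body(source, gcs_uri):
    """jobs.insert body: export `source` to GCS as newline-delimited JSON."""
    return {
        "configuration": {
            "extract": {
                "sourceTable": source,
                # A wildcard URI lets BigQuery split a large export into several files.
                "destinationUris": [gcs_uri],
                "destinationFormat": "NEWLINE_DELIMITED_JSON",
            }
        }
    }


# Hypothetical names, for illustration only.
table = temp_table_ref("my-project", "scratch")
print(json.dumps(query_job_body("SELECT word FROM [publicdata:samples.shakespeare]", table), indent=2))
print(json.dumps(extract_job_body(table, "gs://my-bucket/exports/result-*.json"), indent=2))
```

A worker pulled from the queue would submit the first body, poll the job until its state is DONE, then submit the second body against the same table reference.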

Also note that you can update a table via the API and set its expirationTime property; with this approach you don't need to delete the table yourself.
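As a rough illustration, the body of such a table update could be built as below. `expirationTime` is expressed in milliseconds since the epoch; the helper name and the one-hour lifetime are assumptions made for the example, and sending the value as a string follows the usual JSON convention for 64-bit integers.

```python
import time


def expiration_patch_body(ttl_seconds, now=None):
    """Body for a table update that makes the table expire `ttl_seconds` from now.

    `now` is a Unix timestamp in seconds; it defaults to the current time.
    """
    now = time.time() if now is None else now
    expires_ms = int((now + ttl_seconds) * 1000)  # milliseconds since the epoch
    return {"expirationTime": str(expires_ms)}


# Hypothetical usage: let the temporary table live for one hour.
print(expiration_patch_body(3600))
```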

2) If you use the bq CLI tool, you can set the output format to JSON and redirect the output to a file. This gives you a local export, but it has certain other limits.

This exports the first 1000 rows as JSON:

bq --format=prettyjson query --n=1000 "SELECT * from publicdata:samples.shakespeare" > export.json
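If the redirected file needs to match the newline-delimited JSON that an export job produces, it can be converted locally. This sketch assumes that export.json contains a single JSON array of row objects, which is what the prettyjson format is expected to print; the function name and file paths are hypothetical.

```python
import json


def array_to_ndjson(src_path, dst_path):
    """Rewrite a file holding one JSON array as one JSON object per line."""
    with open(src_path, encoding="utf-8") as src:
        rows = json.load(src)  # assumed to be a list of row objects
    with open(dst_path, "w", encoding="utf-8") as dst:
        for row in rows:
            dst.write(json.dumps(row) + "\n")
    return len(rows)


if __name__ == "__main__":
    # Hypothetical file names matching the command above.
    count = array_to_ndjson("export.json", "export.ndjson")
    print("rows written:", count)
```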
Pentium10 answered Oct 13 '22 21:10
