Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Exporting BigQuery Table Data to Google Cloud Storage having where clause using python

I want to export table data from BigQuery to Google Cloud Storage. Problem is, I need data from date1 to date2 and not whole table data.

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location='US')  # API request
extract_job.result()  

This is what I have found on the google cloud help. There is no space for adding query or limiting data using where clause.

like image 432
Saurav Agarwal Avatar asked Jun 11 '18 09:06

Saurav Agarwal


2 Answers

Unfortunately it will be two step process. First you need to build result table and after export result. From cost perspective impact should be minimal - you will pay for storage used by temp table with result but cost is $0.02 per GB per month - so if you manage to finish you task in 1 hour - cost will be $0.000027 per GB

job_config = bigquery.QueryJobConfig()
gcs_filename = 'file_*.gzip'

table_ref = client.dataset(dataset_id).table('my_temp_table')
job_config.destination = table_ref

job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

# Start the query, passing in the extra configuration.
query_job = client.query(
    """#standardSql
    select * from `project.dataset.table` where <your_condition> ;""",
    location='US',
    job_config=job_config)

while not query_job.done():
    time.sleep(1)

#check if table successfully written
print("query completed")
job_config = bigquery.ExtractJobConfig()
job_config.compression = bigquery.Compression.GZIP
job_config.destination_format = (
    bigquery.DestinationFormat.CSV)
job_config.print_header = False

destination_uri = 'gs://{}/{}'.format(bucket_name, gcs_filename)

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    job_config=job_config,
    location='US')  # API request
extract_job.result()
print("extract completed")
like image 150
Alexey Maloletkin Avatar answered Oct 23 '22 07:10

Alexey Maloletkin


Using the code you provided (following this doc), you can only export the whole table to GCS, not the result of a query.

Alternatively, you can download and save your query result to a local file and upload it to GCS. Or even easier, save the query result to a new BigQuery table and export that new table entirely to GCS with the code you used.

like image 44
LundinCast Avatar answered Oct 23 '22 08:10

LundinCast