
Upload to BigQuery from Python

I have a Python script that downloads data from Firebase, manipulates it, and then dumps it into a JSON file. I can upload that file to BigQuery through the command line, but now I want to add code to the Python script so that everything is done in one step.

Here is the code I have so far.

import json
from firebase import firebase

# Use a separate name so the imported `firebase` module is not rebound.
firebase_app = firebase.FirebaseApplication('<redacted>')
result = firebase_app.get('/connection_info', None)
id_keys = map(str, result.keys())

#with open('result.json', 'r') as w:
 # connection = json.load(w)

# Write one JSON object per line (newline-delimited JSON).
with open("w.json", "w") as outfile:
    for record_id in id_keys:
        json.dump(result[record_id], outfile, indent=None)
        outfile.write("\n")
W. Stephens asked Dec 13 '22 22:12



1 Answer

To load a JSON file with the google-cloud-bigquery Python library, use the Client.load_table_from_file() method.

from google.cloud import bigquery

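bigquery_client = bigquery.Client()
table_id = 'myproject.mydataset.mytable'
# Path to the newline-delimited JSON file, e.g. the w.json written in the question.
source_file_name = 'w.json'

# This example uses JSON, but you can use other formats.
# See https://cloud.google.com/bigquery/loading-data
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
)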

with open(source_file_name, 'rb') as source_file:
    job = bigquery_client.load_table_from_file(
        source_file, table_id, job_config=job_config
    )

job.result()  # Waits for the job to complete.

From the code example at: https://github.com/googleapis/python-bigquery/blob/9d43d2073dc88140ae69e6778551d140430e410d/samples/load_table_file.py#L19-L41
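If the intermediate file is not needed, the same method can also read from an in-memory buffer. The following is only a minimal sketch and is not part of the linked sample: to_ndjson_bytes and load_records are hypothetical helper names, records stands for the result dict fetched from Firebase in the question, and the destination table is assumed to already exist with a matching schema.

```python
import io
import json


def to_ndjson_bytes(records):
    """Serialize a dict of {key: row_dict} to newline-delimited JSON bytes."""
    text = "".join(json.dumps(row) + "\n" for row in records.values())
    return text.encode("utf-8")


def load_records(records, table_id):
    """Load the rows into BigQuery without writing a file to disk.

    Assumes the table named by table_id already exists (hypothetical setup).
    """
    # Imported here so the serialization helper works without the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    )
    buffer = io.BytesIO(to_ndjson_bytes(records))
    job = client.load_table_from_file(buffer, table_id, job_config=job_config)
    job.result()  # Waits for the job to complete.
    return job.output_rows
```

In the question's script this would be called as load_records(result, 'myproject.mydataset.mytable') in place of the loop that writes w.json.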

Edit: the way you upload to a table has changed since version 0.28.0 of the Python library. Below is the way to do it in 0.27 and earlier.

To load a JSON file with the google-cloud-bigquery Python library, use the Table.upload_from_file() method.

from google.cloud import bigquery

bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('mydataset')
table = dataset.table('mytable')

# Reload the table to get the schema.
table.reload()

with open(source_file_name, 'rb') as source_file:
    # This example uses JSON, but you can use other formats.
    # See https://cloud.google.com/bigquery/loading-data
    job = table.upload_from_file(
        source_file, source_format='NEWLINE_DELIMITED_JSON')

From the code example at: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/4de1ac3971d3a94060a1af7f478330b9c40cfb09/bigquery/cloud-client/load_data_from_file.py#L34-L50
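To tell which of the two approaches applies to an existing environment, the installed library version can be compared against the 0.28.0 cut-off mentioned in the edit note. This is a rough sketch using only the standard library (Python 3.8+); parse_version and uses_client_load_api are hypothetical helper names, and the parsing is deliberately simple rather than a full version-specifier implementation.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.28.0' into (0, 28, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '0.28' compares equal to '0.28.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def uses_client_load_api(installed):
    """True for 0.28.0 or later, where Client.load_table_from_file() applies."""
    return parse_version(installed) >= (0, 28, 0)


try:
    installed = version("google-cloud-bigquery")
    print(installed, uses_client_load_api(installed))
except PackageNotFoundError:
    print("google-cloud-bigquery is not installed")
```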

Tim Swast answered Dec 18 '22 10:12
