I have a Python script that downloads data from Firebase, manipulates it, and then dumps it into a JSON file. I can upload that file to BigQuery from the command line, but now I want to add code to the Python script so that everything is done in one step.
Here is the code I have so far.
import json
from firebase import firebase
firebase = firebase.FirebaseApplication('<redacted>')
result = firebase.get('/connection_info', None)
# Write one JSON object per line (newline-delimited JSON).
with open("w.json", "w") as outfile:
  for key in result:
    json.dump(result[key], outfile, indent=None)
    outfile.write("\n")
To load a JSON file with the google-cloud-bigquery Python library, use the Client.load_table_from_file() method.
from google.cloud import bigquery
bigquery_client = bigquery.Client()
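table_id = 'myproject.mydataset.mytable'
# Assumption: the newline-delimited JSON file written by the script in the question.
source_file_name = 'w.json'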
# This example uses JSON, but you can use other formats.
# See https://cloud.google.com/bigquery/loading-data
job_config = bigquery.LoadJobConfig(
    source_format='NEWLINE_DELIMITED_JSON'
)
with open(source_file_name, 'rb') as source_file:
    job = bigquery_client.load_table_from_file(
        source_file, table_id, job_config=job_config
    )
job.result()  # Waits for the job to complete.
From the code example at: https://github.com/googleapis/python-bigquery/blob/9d43d2073dc88140ae69e6778551d140430e410d/samples/load_table_file.py#L19-L41
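Since the goal is to do everything in one script, the intermediate file can probably be skipped. The following is only a sketch under some assumptions: records_to_ndjson and load_records are hypothetical helper names, the input is a dict shaped like the Firebase result (one row per key), and the destination table already exists with a matching schema (otherwise you would likely need to pass a schema or autodetect=True to LoadJobConfig). It builds the newline-delimited JSON in memory and hands it to load_table_from_file() as a file-like object.

```python
import io
import json


def records_to_ndjson(records):
    """Serialize a dict of {key: row_dict} to newline-delimited JSON bytes.

    Hypothetical helper: one JSON object per line, in the dict's own order.
    """
    lines = [json.dumps(row) for row in records.values()]
    if not lines:
        return b""
    return ("\n".join(lines) + "\n").encode("utf-8")


def load_records(records, table_id):
    """Load the records into an existing BigQuery table and return the row count.

    Hypothetical helper; assumes default credentials are configured.
    """
    # Imported here so the serializer above works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    )
    # load_table_from_file() accepts any binary file-like object, not only real files.
    buffer = io.BytesIO(records_to_ndjson(records))
    job = client.load_table_from_file(buffer, table_id, job_config=job_config)
    job.result()  # Waits for the job to complete.
    return job.output_rows
```

With the script from the question, this would be called as load_records(result, 'myproject.mydataset.mytable') right after firebase.get() returns, where the table ID is the same placeholder used above.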
Edit: the way you upload to a table has changed since version 0.28.0 of the Python library. Below is the way to do it in 0.27 and earlier.
To load a JSON file with the google-cloud-bigquery Python library, use the Table.upload_from_file() method.
bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('mydataset')
table = dataset.table('mytable')
# Reload the table to get the schema.
table.reload()
with open(source_file_name, 'rb') as source_file:
    # This example uses JSON, but you can use other formats.
    # See https://cloud.google.com/bigquery/loading-data
    job = table.upload_from_file(
        source_file, source_format='NEWLINE_DELIMITED_JSON')
From the code example at: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/4de1ac3971d3a94060a1af7f478330b9c40cfb09/bigquery/cloud-client/load_data_from_file.py#L34-L50