
How to run a BigQuery query in Python

This is the query I have been running in BigQuery, and I want to run it from my Python script. What do I need to change or add for it to run in Python?

#standardSQL
SELECT
  Serial,
  MAX(createdAt) AS Latest_Use,
  SUM(ConnectionTime/3600) as Total_Hours,
  COUNT(DISTINCT DeviceID) AS Devices_Connected
FROM `dataworks-356fa.FirebaseArchive.testf`
WHERE Model = "BlueBox-pH"
GROUP BY Serial
ORDER BY Serial
LIMIT 1000;

From what I have read, it seems I can't save the result of this query as a permanent table using Python. Is that true? And if it is, is it still possible to export a temporary table?

W. Stephens asked Jul 10 '17 04:07





3 Answers

Here is another way using a JSON file for the service account:

>>> from google.cloud import bigquery
>>>
>>> CREDS = 'test_service_account.json'
>>> client = bigquery.Client.from_service_account_json(json_credentials_path=CREDS)
>>> job = client.query('select * from dataset1.mytable')
>>> for row in job.result():
...     print(row)
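
As a rough illustration, the same pattern could be applied to the query from the question along these lines. This is only a sketch: `fetch_rows` and `main` are hypothetical helper names that are not part of the client library, and the credentials file name is simply reused from the snippet above. The helper accepts any object with a `query()` method whose job has a `result()` method, which is how `bigquery.Client` is used above.

```python
# The query from the question, unchanged apart from dropping the trailing semicolon.
SQL = """
SELECT
  Serial,
  MAX(createdAt) AS Latest_Use,
  SUM(ConnectionTime/3600) AS Total_Hours,
  COUNT(DISTINCT DeviceID) AS Devices_Connected
FROM `dataworks-356fa.FirebaseArchive.testf`
WHERE Model = "BlueBox-pH"
GROUP BY Serial
ORDER BY Serial
LIMIT 1000
"""


def fetch_rows(client, sql):
    """Run sql on the given client and return the rows as a list of dicts.

    Hypothetical helper: not part of google-cloud-bigquery.
    """
    job = client.query(sql)  # starts the query job
    rows = job.result()      # blocks until the job has finished
    return [dict(row.items()) for row in rows]


def main():
    # Imported here so fetch_rows() can be tried out without the library installed.
    from google.cloud import bigquery

    # Assumption: same service-account file name as in the snippet above.
    client = bigquery.Client.from_service_account_json("test_service_account.json")
    for record in fetch_rows(client, SQL):
        print(record)
```

Calling `main()` needs valid credentials and network access; nothing is sent to BigQuery until it is called.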
Aziz Alto answered Oct 11 '22 16:10



You need to use the BigQuery Python client library. Something like this should get you up and running:

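from google.cloud import bigquery

# Note: run_async_query() belongs to older releases of the client library;
# newer releases use client.query() instead.
client = bigquery.Client(project='PROJECT_ID')
query = "SELECT...."
# Destination table that will hold the query results
dataset = client.dataset('dataset')
table = dataset.table(name='table')
job = client.run_async_query('my-job', query)
job.destination = table
# Overwrite the destination table if it already exists
job.write_disposition = 'WRITE_TRUNCATE'
job.begin()  # start the job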

https://googlecloudplatform.github.io/google-cloud-python/stable/bigquery-usage.html

See the current BigQuery Python client tutorial.

Graham Polley answered Oct 11 '22 15:10



This is a good usage guide: https://googleapis.github.io/google-cloud-python/latest/bigquery/usage/index.html

To run a query and write its results to a table:

from google.cloud import bigquery

client = bigquery.Client()
dataset_id = 'your_dataset_id'

job_config = bigquery.QueryJobConfig()
# Set the destination table
table_ref = client.dataset(dataset_id).table("your_table_id")
job_config.destination = table_ref
sql = """
    SELECT corpus
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY corpus;
"""

# Start the query, passing in the extra configuration.
query_job = client.query(
    sql,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location="US",
    job_config=job_config,
)  # API request - starts the query

query_job.result()  # Waits for the query to finish
print("Query results loaded to table {}".format(table_ref.path))
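
On the question of permanent tables, the steps above could be wrapped in a small helper, roughly as sketched below. `save_query_to_table` and `main` are hypothetical names, the table ID is a placeholder, and the sketch assumes a client library version in which `QueryJobConfig.destination` accepts a `"project.dataset.table"` string. The `"WRITE_TRUNCATE"` disposition is borrowed from the earlier answer and replaces the table contents if the table already exists.

```python
def save_query_to_table(client, job_config, sql, table_id):
    """Run sql and store its results in table_id, overwriting existing rows.

    Hypothetical helper: not part of google-cloud-bigquery.
    """
    job_config.destination = table_id                 # permanent destination table
    job_config.write_disposition = "WRITE_TRUNCATE"   # overwrite if the table exists
    query_job = client.query(sql, job_config=job_config)  # starts the query
    query_job.result()                                # waits for the query to finish
    return query_job


def main():
    # Imported here so the helper can be tried out without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = """
        SELECT corpus
        FROM `bigquery-public-data.samples.shakespeare`
        GROUP BY corpus
    """
    # Placeholder table ID: replace with your own project, dataset, and table.
    table_id = "your-project.your_dataset_id.your_table_id"
    save_query_to_table(client, bigquery.QueryJobConfig(), sql, table_id)
    print("Query results loaded to table {}".format(table_id))
```

Calling `main()` requires default credentials and write access to the destination dataset.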
Eric He answered Oct 11 '22 15:10
