Running a BigQuery SQL query in Python, how to authenticate?

I would like to run SQL queries against BigQuery using Python, and I am a complete beginner at this. I have read the 'Create A Simple Application With the API' page (https://cloud.google.com/bigquery/create-simple-app-api#bigquery-simple-app-build-service-python) and my code so far is as follows:

from google.cloud import bigquery

client = bigquery.Client()

query_job = client.query("""
    #standardSQL
    SELECT date, SUM(totals.visits) AS visits
    FROM `myproject.mydataset.ga_sessions_20180111`
    GROUP BY date
    """)

results = query_job.result()  # Waits for job to complete.

for row in results:
    print("{}: {}".format(row.date, row.visits))

When I run this I get the error: OSError: Project was not passed and could not be determined from the environment.

Reading up on this I think the issue relates to the authentication of client = bigquery.Client() - can somebody explain to me in simple terms how this works? Does it look for my authentication details if I am already logged in? If I have permission for multiple projects do I need to specify which one I am working with?

Ben P asked Dec 24 '22 11:12


2 Answers

To authenticate to any GCP API, it's recommended to use a service account credential. The docs explain how to create and download one.

After this step, you should have a JSON file that looks like:

{
 "type": "service_account",
 "project_id": "your project",
 "private_key_id": "your private key id",
 "private_key": "private key",
 "client_email": "email",
 "client_id": "client id",
 "auth_uri": "https://accounts.google.com/o/oauth2/auth",
 "token_uri": "https://accounts.google.com/o/oauth2/token",
 "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
 "client_x509_cert_url":  "https://www.googleapis.com/robot/v1/metadata/x509/email_id"
}

After that, you can export the file path to an environment variable in the OS, like so:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
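
If the variable is unset or the path is mistyped, the resulting error can be hard to read. A small pre-flight check may help; this is only a sketch using the standard library, and `credentials_path_from_env` is a hypothetical helper name, not part of google-cloud-bigquery:

```python
import os


def credentials_path_from_env(env=None):
    """Return the key file path stored in GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical helper: raises a readable error if the variable is unset
    or does not point at an existing file.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError("Key file not found: {}".format(path))
    return path
```

Calling `credentials_path_from_env()` before `bigquery.Client()` should surface a bad path early, assuming the script runs in the same shell session where the variable was exported.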

Alternatively, you can build the client directly from the JSON file in your own script:

import google.cloud.bigquery as bq
client = bq.Client.from_service_account_json("path/to/key.json")

The project_id will be handled automatically for you as well (it is taken from the project in which you created the JSON file).
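
To see which project a given key file belongs to, you can read the "project_id" field shown in the JSON above. A minimal sketch, assuming the file has the layout shown earlier; `project_id_from_key_file` is a hypothetical name used only for illustration:

```python
import json


def project_id_from_key_file(path):
    """Return the "project_id" field of a service account key file."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)
    # Guard against pointing at some other kind of JSON file by mistake.
    if key.get("type") != "service_account":
        raise ValueError("Not a service account key file: {}".format(path))
    return key["project_id"]
```

This is handy when you have keys for several projects and want to confirm which one a script will run against before sending any queries.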

You asked about using your own user credentials. I'm not sure how to authenticate with those, and in any case it is not recommended: you would have to work with google.auth and build the OAuth2 steps manually, whereas a service account handles all of that for you automatically.

Willian Fuks answered Dec 28 '22 07:12


Right, you need to specify the project you will be working with. The following comes from the instructions in this Google Colab notebook: https://colab.research.google.com/notebooks/bigquery.ipynb

You can declare your Google Cloud Platform project ID: project_id = '[your project ID]'

Then, you can just add this variable to the client object creation: client = bigquery.Client(project=project_id)
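
If you switch between several projects, one possible pattern is to prefer an explicit project ID and fall back to the GOOGLE_CLOUD_PROJECT environment variable. This is a sketch, not the notebook's code: `resolve_project` and `make_client` are hypothetical helper names, and the only library call used is `bigquery.Client(project=...)` from above.

```python
import os


def resolve_project(explicit=None, env=None):
    """Pick the project ID: explicit argument first, then GOOGLE_CLOUD_PROJECT."""
    env = os.environ if env is None else env
    project = explicit or env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError(
            "No project given: pass one explicitly or set GOOGLE_CLOUD_PROJECT"
        )
    return project


def make_client(project=None):
    """Build a BigQuery client for the resolved project."""
    # Imported here so resolve_project can be used without the library installed.
    from google.cloud import bigquery

    return bigquery.Client(project=resolve_project(project))
```

For example, `make_client("myproject")` would target that project explicitly, while `make_client()` would rely on the environment variable.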

AlexK answered Dec 28 '22 06:12