I would like to run SQL queries against BigQuery using Python; I am a complete beginner at this. I have read the 'Create A Simple Application With the API' page (https://cloud.google.com/bigquery/create-simple-app-api#bigquery-simple-app-build-service-python) and my code is as follows:
from google.cloud import bigquery
client = bigquery.Client()
query_job = client.query("""
#standardSQL
SELECT date, SUM(totals.visits) AS visits
FROM `myproject.mydataset.ga_sessions_20180111`
GROUP BY date
""")
results = query_job.result() # Waits for job to complete.
for row in results:
    print("{}: {}".format(row.date, row.visits))
When I run this I get the error: OSError: Project was not passed and could not be determined from the environment.
Reading up on this, I think the issue relates to how client = bigquery.Client() authenticates
- can somebody explain to me in simple terms how this works? Does it pick up my authentication details if I am already logged in? If I have permission for multiple projects, do I need to specify which one I am working with?
In order to authenticate to any GCP API, the recommended approach is to use a service account credential; the docs explain how to create and download one.
After this step, you should have a JSON key file that looks like this:
{
"type": "service_account",
"project_id": "your project",
"private_key_id": "your private key id",
"private_key": "private key",
"client_email": "email",
"client_id": "client id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/email_id"
}
After that, you can either point the GOOGLE_APPLICATION_CREDENTIALS environment variable at the file path, like so:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
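With that variable set, the Client() call from your question should work with no arguments, because both the credentials and the project are read from the key file. A minimal sketch to verify this (nothing else needs to change in your script):

from google.cloud import bigquery

# Credentials and project are picked up from GOOGLE_APPLICATION_CREDENTIALS.
client = bigquery.Client()
print(client.project)  # should print the project_id from the key file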
Or you can build the Client directly from the JSON file in your own script:
import google.cloud.bigquery as bq
client = bq.Client.from_service_account_json("path/to/key.json")
The project_id will be handled automatically for you as well, taken from the project in which you created the JSON key file.
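Putting this together with the query from your question, a minimal sketch (the key file path and the table name are placeholders you would replace with your own):

import google.cloud.bigquery as bq

# Both credentials and the project come from the service account key file.
client = bq.Client.from_service_account_json("path/to/key.json")

query_job = client.query("""
    SELECT date, SUM(totals.visits) AS visits
    FROM `myproject.mydataset.ga_sessions_20180111`
    GROUP BY date
""")

for row in query_job.result():  # waits for the query job to finish
    print("{}: {}".format(row.date, row.visits))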
You asked about using your own user credentials; I'm not sure how to authenticate with those, but it is not recommended anyway: you would have to manage google.auth and build the OAuth2 flow manually, all of which is handled automatically for you with a service account.
Right, you need to specify the project you will be working with. Per the instructions in this Google Colab notebook: https://colab.research.google.com/notebooks/bigquery.ipynb:
You can declare your Google Cloud Platform project ID:
project_id = '[your project ID]'
Then, pass this variable when creating the client object:
client = bigquery.Client(project=project_id)
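For example, assuming default credentials are already available in your environment (the project ID below is a placeholder, and the query is just a quick connectivity check):

from google.cloud import bigquery

project_id = 'your-project-id'  # placeholder
client = bigquery.Client(project=project_id)

# Any small query confirms the client can reach the specified project.
for row in client.query("SELECT 1 AS ok").result():
    print(row.ok)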