I have a need to programmatically analyse the details of Scheduled Queries in BigQuery (e.g. which tables are updated and which tables are accessed in the SQL). I have done something similar for BQ Tables/Views using Apps Script's BigQuery.Tables.list(), but I cannot find an API to access the Scheduled Queries.
The UI is able to list them, so I feel this should be possible programmatically, e.g. via a REST API. Does anyone know if this is possible, which interfaces are supported (Apps Script, REST, ...), and possibly provide an example of how to use it?
Scheduled queries are part of BigQuery's Data Transfer Service, so you have to use its API; in particular, the projects.transferConfigs.list method. Fill in the dataSourceIds field with scheduled_query and parent with projects/PROJECT_ID. As discussed in the comments, if you are using a regional location such as europe-west2 instead of a multi-regional one (EU or US), you should use projects.locations.transferConfigs.list instead; the parent resource will then be in the form projects/PROJECT_ID/locations/REGIONAL_LOCATION.
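As a minimal sketch of the multi-regional call (PROJECT_ID is a placeholder for your own project, and gcloud is assumed to be installed and authenticated):

#!/bin/bash
# List scheduled queries in a multi-regional location (US or EU).
# Note the parent is just projects/PROJECT_ID, with no locations segment.
curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://bigquerydatatransfer.googleapis.com/v1/projects/PROJECT_ID/transferConfigs?dataSourceIds=scheduled_query"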
In addition, for other transfers you can get the corresponding dataSourceIds using the projects.dataSources.list method. That's how I got the scheduled_query one.
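A sketch of that discovery call, again with PROJECT_ID as a placeholder:

#!/bin/bash
# List the data sources available to a project; each entry carries a
# dataSourceId (such as scheduled_query) to filter transfer configs with.
curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://bigquerydatatransfer.googleapis.com/v1/projects/PROJECT_ID/dataSources"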
The response to the transferConfigs.list call will be an array of scheduled queries such as:
{
  "name": "projects/<PROJECT_NUMBER>/locations/us/transferConfigs/<TRANSFER_CONFIG_ID>",
  "destinationDatasetId": "<DATASET>",
  "displayName": "hacker-news",
  "updateTime": "2018-11-14T15:39:18.897911Z",
  "dataSourceId": "scheduled_query",
  "schedule": "every 24 hours",
  "nextRunTime": "2019-04-19T15:39:00Z",
  "params": {
    "write_disposition": "WRITE_APPEND",
    "query": "SELECT @run_time AS time,\n title,\n author,\n text\nFROM `bigquery-public-data.hacker_news.stories`\nLIMIT\n 1000",
    "destination_table_name_template": "hacker_daily_news"
  },
  "state": "SUCCEEDED",
  "userId": "<USER_ID>",
  "datasetRegion": "us"
}
Example of an API call with bash and curl:
#!/bin/bash
# parameter(s)
location=europe-west2
authToken="$(gcloud auth print-access-token)"
projectId=$(gcloud config get-value project 2>/dev/null)
# API call (URL quoted so the shell does not glob-expand the ?)
scheduled_queries=$(curl -s -H "Authorization: Bearer $authToken" \
  "https://bigquerydatatransfer.googleapis.com/v1/projects/$projectId/locations/$location/transferConfigs?dataSourceIds=scheduled_query")
# pretty print results
echo "$scheduled_queries" | python -m json.tool
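Since the original goal was to analyse the SQL itself, a small follow-up sketch (assuming jq is installed) can pull just the query text out of the response captured above:

# Extract the SQL of each scheduled query for further analysis.
echo "$scheduled_queries" | jq -r '.transferConfigs[].params.query'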
The above answers cover the REST API well. For completeness, I'd like to include the CLI approach to solving the same problem. Personally, I find this better suited to shell scripts, but YMMV.
Example: List of Scheduled Queries from the Default Project.
bq ls --transfer_config --transfer_location=US --format=prettyjson
Example: Details of a Scheduled Query from the Default Project.
bq show --format=prettyjson --transfer_config [RESOURCE_NAME]
# RESOURCE_NAME is a value you can get from the above bq ls command.
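To script the two steps together, here's a rough sketch (assuming jq is installed and a US transfer location) that feeds each name returned by bq ls into bq show:

#!/bin/bash
# Show full details for every scheduled query in the default project.
# jq filters the transfer configs down to the scheduled_query data source.
for resource in $(bq ls --transfer_config --transfer_location=US --format=json \
    | jq -r '.[] | select(.dataSourceId == "scheduled_query") | .name'); do
  bq show --format=prettyjson --transfer_config "$resource"
done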
Further details can be found in the BigQuery documentation on scheduling queries.