Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I create a new view in bigquery using the python API?

I have some code that automatically generates a bunch of different SQL queries that I would like to insert into the bigquery to generate views, though one of the issues that I have is that these views need to be generated dynamically every night because of the changing nature of the data. So what I would like to be able to do is use the google bigquery api for python to be able to make a view. I understand how to do it using the 'bq' command line tool, but I'd like to be able to have this built directly into the code as opposed to using a shell to run bq. I have played with the code provided at

https://cloud.google.com/bigquery/bigquery-api-quickstart

I don't understand how to use this bit of code to create a view instead of just returning the results of a SELECT statement. I can see the documentation about doing table inserts here

https://cloud.google.com/bigquery/docs/reference/v2/tables/insert

but that refers to using the REST API to generate new tables as opposed to the example provided above.

Is it just not possible? Should I just give in and use bq?

Thanks

*** Some additional questions in response to Felipe's comments.

The table resource document indicates that there are a number of required fields, some of which make sense even if I don't fully understand what they're asking for, others do not. For example, externalDataConfiguration.schema. Does this refer to the schema for the database that I'm connecting to (I assume it does), or the schema for storing the data?

What about externalDataConfiguration.sourceFormat? Since I'm trying to make a view of a pre-existing database, I'm not sure I understand how the source format is relevant. Is it the source format of the database I'm making a view from? How would I identify that?

ANd externalDataConfiguration.sourceUris[], I'm not importing new data into the database, so I don't understand how this (or the previous element) are required.

What about schema?

tableReference.datasetId, tableReference.projectId, and tableReference.tableId are self explanatory.

Type would be view, and view.query would be the actual sql query used to make the view. So I get why those are required for making a view, but I don't understand the other parts.

Can you help me understand these details?

Thanks, Brad

like image 436
Brad Davis Avatar asked Oct 20 '15 17:10

Brad Davis


People also ask

Can Python connect to BigQuery?

The BigQuery client library for Python is automatically installed in a managed notebook. Behind the scenes, the %%bigquery magic command uses the BigQuery client library for Python to run the given query, convert the results to a pandas DataFrame, optionally save the results to a variable, and then display the results.


2 Answers

Using https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert

Submit something like below, assuming you add the authorization

{
  "view": {
    "query": "select column1, count(1) `project.dataset.someTable` group by 1",
    "useLegacySql": false
  },
  "tableReference": {
    "tableId": "viewName",
    "projectId": "projectName",
    "datasetId": "datasetName"
  }
}

Alternatively in Python using, assuming you have a service key setup and the environmental variable GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/key. The one caveat is that as far as I can tell this can only create views using legacy sql, and as an extension can only be queried using legacy sql, though the straight API method allows legacy or standard.

from google.cloud import bigquery

def create_view(dataset_name, view_name, project, viewSQL):

    bigquery_client = bigquery.Client(project=project)

    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(view_name)

    table.view_query = viewSQL

    try:
        table.create()
        return True
    except Exception as err:
        print(err)
        return False
like image 179
sbr Avatar answered Sep 30 '22 19:09

sbr


Note: this changed a little bit with 0.28.0 of the library - see the following for further details: Google BigQuery: creating a view via Python google-cloud-bigquery version 0.27.0 vs. 0.28.0

my example function

# create a view via python
def create_view(dataset_name, view_name, sqlQuery, project=None):
    try:

        bigquery_client = bigquery.Client(project=project)
        dataset_ref = bigquery_client.dataset(dataset_name)
        table_ref = dataset_ref.table(view_name)
        table = Table(table_ref)
        table.view_query = sqlQuery
        table.view_use_legacy_sql = False
        bigquery_client.create_table(table)

        return True

    except Exception as e:
        errorStr = 'ERROR (create_view): ' + str(e)
        print(errorStr)
        raise
like image 43
Rich Murnane Avatar answered Sep 30 '22 20:09

Rich Murnane