Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BigQuery: How to overwrite a table with bigquery.Client().copy_table method

Here is a code I am using as a reference from https://cloud.google.com/bigquery/docs/managing-tables#bigquery-copy-table-python:

source_dataset = client.dataset('samples', project='bigquery-public-data')
source_table_ref = source_dataset.table('shakespeare')

# dataset_id = 'my_dataset'
dest_table_ref = client.dataset(dataset_id).table('destination_table')

job = client.copy_table(
    source_table_ref,
    dest_table_ref,
    # Location must match that of the source and destination tables.
    location='US')  # API request

job.result()  # Waits for job to complete.

In my case the destination table exists and I get this error:

Already Exists

How can I overwrite with this copy_table method? In the bq commandline, I can use -f option. So I am looking for the same flag.

like image 363
kee Avatar asked Oct 12 '18 08:10

kee


People also ask

How do I overwrite existing table in BigQuery?

To append to or overwrite a table using query results, specify a destination table and set the write disposition to either: Append to table — Appends the query results to an existing table. Overwrite table — Overwrites an existing table with the same name using the query results.

How do you overwrite a partition in BigQuery?

You can always over-write a partitioned table in BQ using the postfix of YYYYMMDD in the output table name of your query, along with using WRITE_TRUNCATE as your write disposition (i.e. to truncate whatever is existing in that partition and write new results).

How do you update a table in a large query?

To update a table in BigQuery you can use BigQueryUpdateTableOperator . The update method replaces the entire Table resource, whereas the patch method only replaces fields that are provided in the submitted Table resource.

Can we update table in BigQuery?

The BigQuery data manipulation language (DML) enables you to update, insert, and delete data from your BigQuery tables. You can execute DML statements just as you would a SELECT statement, with the following conditions: You must use Google Standard SQL.


2 Answers

You need to pass the job config to the request like this:

job_config = bigquery.CopyJobConfig()
job_config.write_disposition = "WRITE_TRUNCATE"
job = client.copy_table(
    source_table_ref,
    dest_table_ref,
    location='US',
    job_config=job_config)  # API request

More docs here: https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.client.Client.copy_table.html

like image 180
Robert Lacok Avatar answered Oct 20 '22 06:10

Robert Lacok


See the linked docs:

configuration.copy.writeDisposition: Specifies the action that occurs if the destination table already exists.

The following values are supported:

  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data.
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.

The default value is WRITE_EMPTY.

like image 30
Felipe Hoffa Avatar answered Oct 20 '22 05:10

Felipe Hoffa