
How to update/delete rows in BigQuery from the Python API?

I'm designing a BigQuery job in Python that updates and inserts into several tables. I thought of two ways to achieve that:

  1. Execute a query job and save the result into a temporary table with an update/insert indicator, then process the rows afterwards. But it's not clear how to run the updates with the Python libraries.

  2. Load the whole data set into a new partitioned table and skip updates/inserts. It takes more space than I would like, but the partitions expire in a few days anyway.

Am I missing something? Is there another way to achieve this?

Manuel Valero asked Feb 08 '18 11:02



1 Answer

You can run Data Manipulation Language (DML) statements such as UPDATE, INSERT and DELETE through the Google BigQuery API in the same way you run SELECT queries.

For instance, in order to update specific rows in the following table:

Inventory
+-------------------+----------+--------------------+
|      product      | quantity | supply_constrained |
+-------------------+----------+--------------------+
| dishwasher        |       30 |               NULL |
| dryer             |       30 |               NULL |
| front load washer |       30 |               NULL |
| microwave         |       30 |               NULL |
+-------------------+----------+--------------------+

you could use the following code:

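from google.cloud import bigquery

client = bigquery.Client()  # Uses the default project and credentials

# Subtract 10 from the quantity of every product whose name contains "washer"
dml_statement = (
    "UPDATE dataset.Inventory "
    "SET quantity = quantity - 10 "
    "WHERE product LIKE '%washer%'")
query_job = client.query(dml_statement)  # API request
query_job.result()  # Waits for statement to finish
print(query_job.num_dml_affected_rows)  # Number of rows the UPDATE modified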

After the statement finishes, the table contains:

Inventory
+-------------------+----------+--------------------+
|      product      | quantity | supply_constrained |
+-------------------+----------+--------------------+
| dishwasher        |       20 |               NULL |
| dryer             |       30 |               NULL |
| front load washer |       20 |               NULL |
| microwave         |       30 |               NULL |
+-------------------+----------+--------------------+
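
Deletes work the same way, and the "temporary table with an update/insert indicator" idea from the question can probably be collapsed into a single MERGE statement. The following is only a sketch under some assumptions: `dataset.InventoryStaging` is a hypothetical staging table with the same `product` and `quantity` columns, and the helper names (`build_delete_statement`, `build_merge_statement`, `pattern_config`, `run_dml`) are mine, not part of the library. The statement builders are kept separate from the client call so they can be inspected without touching BigQuery.

```python
def build_delete_statement(table):
    """DELETE rows whose product matches the @pattern query parameter."""
    # BigQuery requires a WHERE clause on DELETE statements.
    return f"DELETE FROM `{table}` WHERE product LIKE @pattern"


def build_merge_statement(target, staging):
    """Upsert rows from a staging table (hypothetical name) into the target."""
    return (
        f"MERGE `{target}` T "
        f"USING `{staging}` S "
        "ON T.product = S.product "
        "WHEN MATCHED THEN UPDATE SET quantity = S.quantity "
        "WHEN NOT MATCHED THEN INSERT (product, quantity) "
        "VALUES (S.product, S.quantity)"
    )


def pattern_config(pattern):
    """Job config that binds the @pattern parameter used by the DELETE."""
    # Imported lazily so the builders above work without the library installed.
    from google.cloud import bigquery

    return bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("pattern", "STRING", pattern)
        ]
    )


def run_dml(client, statement, job_config=None):
    """Run one DML statement and return the number of affected rows."""
    query_job = client.query(statement, job_config=job_config)  # API request
    query_job.result()  # Waits for the statement to finish
    return query_job.num_dml_affected_rows


# Example usage (needs credentials and real tables, so it is left commented):
# from google.cloud import bigquery
# client = bigquery.Client()
# run_dml(client, build_delete_statement("dataset.Inventory"),
#         pattern_config("%washer%"))
# run_dml(client, build_merge_statement("dataset.Inventory",
#                                       "dataset.InventoryStaging"))
```

Passing the pattern as a query parameter instead of formatting it into the string avoids quoting problems when the value comes from user input. Table names cannot be passed as parameters, which is why they are formatted into the statement text here.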
vreyespue answered Oct 27 '22 22:10