I'm designing a BigQuery job in Python that updates and inserts into several tables. I thought of two ways to achieve that:
1. Execute a query job and save the result into a temporary table with an update/insert indicator, then process the rows afterwards. But it's not clear how to run the updates with the Python libraries.
2. Load the whole data set into a new partitioned table and skip the updates/inserts. It takes more space than I would like, but the partitions expire in a few days anyway.
Am I missing something? Is there another way to achieve this?
The BigQuery data manipulation language (DML) enables you to update, insert, and delete data from your BigQuery tables. You can execute DML statements just as you would a SELECT statement, with the following conditions: You must use Google Standard SQL.
BigQuery has supported Data Manipulation Language (DML) functionality since 2016 for standard SQL, which enables you to insert, update, and delete rows and columns in your BigQuery datasets.
// PHP client: delete a table
$dataset = $bigQuery->dataset($datasetId);
$table = $dataset->table($tableId);
$table->delete();
To append to or overwrite a table using query results, specify a destination table and set the write disposition to either:
Append to table: appends the query results to an existing table.
Overwrite table: overwrites an existing table with the same name using the query results.
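As a rough sketch of how the destination table and write disposition could be set from Python, assuming the google-cloud-bigquery package is installed and default credentials are configured. The helper names `write_disposition_for` and `run_query_into_table` are made up for this illustration, and the table ID you pass in is your own.

```python
def write_disposition_for(append: bool) -> str:
    """Return the write disposition name for appending or overwriting.

    These strings are the values behind bigquery.WriteDisposition.WRITE_APPEND
    and bigquery.WriteDisposition.WRITE_TRUNCATE.
    """
    return "WRITE_APPEND" if append else "WRITE_TRUNCATE"


def run_query_into_table(sql: str, destination: str, append: bool = True):
    """Run `sql` and write its results into `destination` ("project.dataset.table").

    Hypothetical helper: the import is done lazily so the function above can be
    used without the client library installed.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        destination=destination,
        write_disposition=write_disposition_for(append),
    )
    query_job = client.query(sql, job_config=job_config)  # API request
    query_job.result()  # Waits for the query to finish
    return query_job
```

With `append=False` the existing contents of the destination table are replaced, so that variant fits the "load everything into a fresh table" approach from the question.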
You can simply run Data Manipulation Language (DML) statements through the Google BigQuery API, the same way you would run a SELECT query.
For instance, in order to update specific rows in the following table:
Inventory
+-------------------+----------+--------------------+
| product | quantity | supply_constrained |
+-------------------+----------+--------------------+
| dishwasher | 30 | NULL |
| dryer | 30 | NULL |
| front load washer | 30 | NULL |
| microwave | 30 | NULL |
+-------------------+----------+--------------------+
you could use the following code:
from google.cloud import bigquery

client = bigquery.Client()

# Decrement the quantity of every product whose name contains "washer"
dml_statement = (
    "UPDATE dataset.Inventory "
    "SET quantity = quantity - 10 "
    "WHERE product LIKE '%washer%'"
)
query_job = client.query(dml_statement)  # API request
query_job.result()  # Waits for the statement to finish
obtaining the following results:
Inventory
+-------------------+----------+--------------------+
| product | quantity | supply_constrained |
+-------------------+----------+--------------------+
| dishwasher | 20 | NULL |
| dryer | 30 | NULL |
| front load washer | 20 | NULL |
| microwave | 30 | NULL |
+-------------------+----------+--------------------+
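Since the question asks for updates and inserts together, one option worth considering is a single MERGE statement, which is also part of BigQuery DML. The following is only a sketch under assumptions: `dataset.InventoryStaging` is a hypothetical staging table holding the new rows, and `build_merge_sql` and `run_merge` are illustrative helper names, not part of the client library.

```python
def build_merge_sql(target: str, staging: str, key: str, columns: list) -> str:
    """Build a MERGE that updates rows matching on `key` and inserts the rest."""
    updates = ", ".join(f"{col} = S.{col}" for col in columns)
    insert_cols = ", ".join([key, *columns])
    insert_vals = ", ".join(f"S.{col}" for col in [key, *columns])
    return (
        f"MERGE `{target}` T "
        f"USING `{staging}` S "
        f"ON T.{key} = S.{key} "
        f"WHEN MATCHED THEN UPDATE SET {updates} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


def run_merge(sql: str):
    """Execute the statement and return the number of rows it changed."""
    # Imported lazily; assumes google-cloud-bigquery and default credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    query_job = client.query(sql)  # API request
    query_job.result()  # Waits for the statement to finish
    return query_job.num_dml_affected_rows


# Table names below are assumptions for illustration only.
merge_sql = build_merge_sql(
    "dataset.Inventory", "dataset.InventoryStaging", "product", ["quantity"]
)
```

Calling `run_merge(merge_sql)` would then apply the updates and inserts in one job, which avoids the separate update/insert indicator column described in the first approach.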