
BigQuery - Delete rows from Partitioned Table

I have a Day-Partitioned Table on BigQuery. When I try to delete some rows from the table using a query like:

DELETE FROM `MY_DATASET.partitioned_table` WHERE id = 2374180

I get the following error:

Error: DML statements are not yet supported over partitioned tables.

A quick Google search leads me to: https://cloud.google.com/bigquery/docs/loading-data-sql-dml where it also says: "DML statements that modify partitioned tables are not yet supported."

So for now, is there a workaround we can use to delete rows from a partitioned table?

Kwame asked Mar 02 '17 17:03



1 Answer

DML currently has some known issues and limitations, such as:

  • DML statements cannot be used to modify tables with REQUIRED fields in their schema.
  • Each DML statement initiates an implicit transaction, which means that changes made by the statement are automatically committed at the end of each successful DML statement. There is no support for multi-statement transactions.
  • The following combinations of DML statements are allowed to run concurrently on a table:
      • UPDATE and INSERT
      • DELETE and INSERT
      • INSERT and INSERT
    Otherwise one of the DML statements will be aborted. For example, if two UPDATE statements execute simultaneously against the table, only one of them will succeed.
  • Tables that have been written to recently via BigQuery Streaming (tabledata.insertAll) cannot be modified using UPDATE or DELETE statements. To check whether the table has a streaming buffer, look in the tables.get response for a section named streamingBuffer. If it is absent, the table can be modified using UPDATE or DELETE statements.
  • DML statements that modify partitioned tables are not yet supported.
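The streaming-buffer check from the list above can be sketched in a few lines of Python. This assumes you already have the tables.get response parsed into a dict; the two sample responses below are made-up, trimmed-down illustrations, not real API output.

```python
def has_streaming_buffer(table_resource: dict) -> bool:
    """Return True if a parsed tables.get response has a streamingBuffer section."""
    return "streamingBuffer" in table_resource


# Hypothetical, trimmed-down tables.get responses for illustration only.
recently_streamed = {
    "id": "my-project:MY_DATASET.events",
    "streamingBuffer": {"estimatedRows": "120"},
}
settled = {"id": "my-project:MY_DATASET.events"}

for resource in (recently_streamed, settled):
    if has_streaming_buffer(resource):
        print(resource["id"], "-> wait before running UPDATE/DELETE")
    else:
        print(resource["id"], "-> UPDATE/DELETE can run")
```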

Also be aware of the quota limits:

  • Maximum UPDATE/DELETE statements per day per table: 48
  • Maximum UPDATE/DELETE statements per day per project: 500
  • Maximum INSERT statements per day per table: 1,000
  • Maximum INSERT statements per day per project: 10,000
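Since the quota counts statements rather than rows, one way to stay under it is to fold many deletions into a single statement. Below is a minimal sketch of that idea; the helper name and the table name are hypothetical, and it assumes an integer id column like the one in the question.

```python
from typing import Sequence


def build_batched_delete(table: str, ids: Sequence[int]) -> str:
    """Build one DELETE statement that removes every id in `ids`."""
    if not ids:
        raise ValueError("ids must not be empty")
    # int() keeps non-numeric values out of the SQL text.
    id_list = ", ".join(str(int(i)) for i in ids)
    return f"DELETE FROM `{table}` WHERE id IN ({id_list})"


# One statement (one unit of quota) instead of three.
statement = build_batched_delete(
    "MY_DATASET.some_table", [2374180, 2374181, 2374182]
)
print(statement)
```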

What you can do is copy the entire partition to a non-partitioned table and execute the DML statement there. Then write the temp table back to the partition. Also, if you run into the limit on UPDATE/DELETE statements per day per table, you can create a copy of the table and run the DML on the copy to avoid the limit.
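The copy, modify, write-back steps could look roughly like this with the bq command-line tool. This is only a sketch under some assumptions: the partition day 20170301 and the temp table name are made up, you need to know which day's partition holds the row, and the last step relies on `bq cp -f` being allowed to overwrite a `$YYYYMMDD` partition decorator, so try it on a test table first. The script only builds and prints the commands; it does not run anything.

```python
from typing import List


def partition_rewrite_plan(
    dataset: str, table: str, day: str, where: str
) -> List[List[str]]:
    """Return the three bq invocations (as argv lists) for one day partition.

    `day` is the partition in YYYYMMDD form, e.g. "20170301".
    """
    partition = f"{dataset}.{table}${day}"  # partition decorator
    temp = f"{dataset}.{table}_tmp_{day}"   # hypothetical temp table name
    return [
        # 1. Copy the single partition into a non-partitioned temp table.
        ["bq", "cp", partition, temp],
        # 2. Run the DML statement against the temp table.
        ["bq", "query", "--use_legacy_sql=false",
         f"DELETE FROM `{temp}` WHERE {where}"],
        # 3. Overwrite the partition with the cleaned temp table.
        ["bq", "cp", "-f", temp, partition],
    ]


plan = partition_rewrite_plan(
    "MY_DATASET", "partitioned_table", "20170301", "id = 2374180"
)
for argv in plan:
    print(" ".join(argv))
```

The printed lines are not shell-quoted (the `$` and backticks would need quoting in a shell). To actually run a step, passing each list to `subprocess.run(argv, check=True)` sidesteps the quoting problem.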

Pentium10 answered Oct 21 '22 02:10
