I'm getting the following error when trying to delete records from a table that was created through the GCP Console and populated with the GCP BigQuery Node.js table insert function.
UPDATE or DELETE DML statements are not supported over table stackdriver-360-150317:my_dataset.users with streaming buffer
The table was created without streaming features, and from what I'm reading in the documentation, "Tables that have been written to recently via BigQuery Streaming (tabledata.insertAll) cannot be modified using UPDATE or DELETE statements."

Does that mean that once a record has been inserted into a table with this function, there's no way to delete records at all? If so, does the table need to be deleted and recreated from scratch? If not, can you please suggest a workaround to avoid this issue?
Thanks!
Including new error message for SEO: "UPDATE or DELETE statement over table ... would affect rows in the streaming buffer, which is not supported" -- Fh
Streaming buffer: the buffer that retains recently inserted rows; it is optimized for high-throughput writes rather than columnar access.
Instant availability reader: allows the query engine to read records directly from the streaming buffer.
Columnar storage: data associated with a table that's in columnar format.
The BigQuery data manipulation language (DML) enables you to update, insert, and delete data from your BigQuery tables.
You can delete a BigQuery table in the following ways: using the Cloud Console; using a Data Definition Language (DDL) DROP TABLE statement; or using the bq command-line tool's bq rm command.
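For example, a minimal sketch of dropping a table with the Node.js client the question already uses (the dataset and table names are taken from the question and purely illustrative):

    const {BigQuery} = require('@google-cloud/bigquery');

    async function dropTable() {
      const bigquery = new BigQuery();
      // Calls tables.delete: same effect as a DDL DROP TABLE
      // statement or `bq rm` -- removes the table and all its data.
      await bigquery.dataset('my_dataset').table('users').delete();
    }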
To check whether the table has a streaming buffer, look at the tables.get response for a section named streamingBuffer. Alternatively, when streaming to a partitioned table, rows in the streaming buffer have a NULL value for the _PARTITIONTIME pseudo column, so the buffer can be checked even with a simple WHERE query.
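For example, a minimal sketch of the tables.get check using the @google-cloud/bigquery Node.js client (the dataset and table names are taken from the question):

    const {BigQuery} = require('@google-cloud/bigquery');

    async function hasStreamingBuffer() {
      const bigquery = new BigQuery();
      // tables.get returns the table resource; the streamingBuffer
      // section is only present while rows are still buffered.
      const [metadata] = await bigquery
        .dataset('my_dataset')
        .table('users')
        .getMetadata();
      if (metadata.streamingBuffer) {
        console.log('~%s rows still buffered since %s',
          metadata.streamingBuffer.estimatedRows,
          metadata.streamingBuffer.oldestEntryTime);
        return true;
      }
      return false;
    }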
Streamed data is available for real-time analysis within a few seconds of the first streaming insertion into a table, but it can take up to 90 minutes to become available for copy/export and other operations. You probably have to wait up to 90 minutes until the whole buffer is persisted on the cluster. You can use queries to see whether the streaming buffer is empty or not, as you mentioned.
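For instance, a quick check like this (a sketch that assumes the table is ingestion-time partitioned, so buffered rows show up with a NULL _PARTITIONTIME):

    const {BigQuery} = require('@google-cloud/bigquery');

    async function bufferedRowCount() {
      const bigquery = new BigQuery();
      // Rows still in the streaming buffer of an ingestion-time
      // partitioned table have a NULL _PARTITIONTIME.
      const [rows] = await bigquery.query({
        query: 'SELECT COUNT(*) AS buffered ' +
               'FROM `stackdriver-360-150317.my_dataset.users` ' +
               'WHERE _PARTITIONTIME IS NULL',
      });
      return rows[0].buffered; // 0 means the buffer has drained
    }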
If you use a load job to create the table, it won't have a streaming buffer, but you probably streamed some values into it.
Note the answer below for working with tables that have an ongoing streaming buffer: just use a WHERE clause to filter out the latest minutes of data and your queries will work. -- Fh
Make sure to change your filters so they don't include data that could be in the current streaming buffer.
For example, this query fails while I'm streaming to this table:
DELETE FROM `project.dataset.table` WHERE id LIKE '%-%'

Error: UPDATE or DELETE statement over table project.dataset.table would affect rows in the streaming buffer, which is not supported
You can fix it by only deleting older records:
DELETE FROM `project.dataset.table` WHERE id LIKE '%-%' AND ts < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 40 MINUTE)

4282 rows affected.
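Since the question streams with the Node.js client, the same time-filtered DELETE can also be run from code. A minimal sketch (the table name and the id and ts columns come from the example above and are illustrative):

    const {BigQuery} = require('@google-cloud/bigquery');

    async function deleteOldRows() {
      const bigquery = new BigQuery();
      // Only target rows old enough to have left the streaming
      // buffer (data can stay buffered for up to ~90 minutes).
      const [job] = await bigquery.createQueryJob({
        query: 'DELETE FROM `project.dataset.table` ' +
               "WHERE id LIKE '%-%' " +
               'AND ts < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 40 MINUTE)',
      });
      await job.getQueryResults(); // wait for the DML job to finish
      const [metadata] = await job.getMetadata();
      console.log('Rows deleted:', metadata.statistics.query.numDmlAffectedRows);
    }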