Here's the case:
The thing is, even though we've got everything set up and the CSVs are loading automatically, the transfer/load job runs with the default "WRITE_APPEND" disposition, so the tables are appended to instead of overwritten in BigQuery.
Hence the question: how/where can we set
configuration.load.writeDisposition = WRITE_TRUNCATE
as stated here, so that the tables are overwritten when the CSVs are loaded automatically?
I think that's what we're missing.
Cheers.
To append to or overwrite a table using query results, specify a destination table and set the write disposition to either:
Append to table: appends the query results to an existing table.
Overwrite table: overwrites an existing table of the same name with the query results.
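For reference, this is roughly what that setting looks like if you run the query yourself with the Python client library. A minimal sketch, assuming placeholder project/dataset/table names:

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder destination table; replace with your own project/dataset/table.
table_id = "my-project.my_dataset.my_table"

job_config = bigquery.QueryJobConfig(
    destination=table_id,
    # WRITE_TRUNCATE overwrites the destination table on each run;
    # WRITE_APPEND (the default behavior in question) would append instead.
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

query = """
    SELECT corpus, COUNT(*) AS word_rows
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY corpus
"""
client.query(query, job_config=job_config).result()  # Wait for completion.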
In the Google Cloud console, go to the BigQuery page and click Data transfers. Select the transfer whose details you want to view, and on the Transfer details page select a transfer run.
None of the above worked for us, so I'm posting this in case anyone has the same issue.
We scheduled a query to erase the table content just before the automatic import process starts:
DELETE FROM project.tableName WHERE true
The new data is then imported into an empty table, so the default "WRITE_APPEND" doesn't affect us.
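If you'd rather trigger that cleanup from code instead of a scheduled query, here's a minimal sketch with the Python client library (the table name is a placeholder; note that BigQuery expects at least dataset.table in the DELETE):

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table name; replace with your own dataset and table.
delete_job = client.query("DELETE FROM `my_dataset.tableName` WHERE true")
delete_job.result()  # Block until the table is empty, before the load starts.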
1) One way to do this is to use DDL to CREATE OR REPLACE your table before the job that imports the data runs.
Here's an example of how to create (or replace) a table:
#standardSQL
CREATE OR REPLACE TABLE mydataset.top_words
OPTIONS(
  description="Top ten words per Shakespeare corpus"
) AS
SELECT
  corpus,
  ARRAY_AGG(STRUCT(word, word_count) ORDER BY word_count DESC LIMIT 10) AS top_words
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY corpus;
Now that the table is created, you can import your data.
2) Another way is to use BigQuery scheduled queries.
3) If you write Python, you can find an even better solution here, along the lines of the sketch below.
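For what it's worth, the Python route usually comes down to running the load job yourself with the write disposition set explicitly, which is exactly the configuration.load.writeDisposition from the question. A minimal sketch using the google-cloud-bigquery client (bucket, file, and table names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names; replace with your own bucket, file, and table.
uri = "gs://my-bucket/my-file.csv"
table_id = "my-project.my_dataset.my_table"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    # WRITE_TRUNCATE overwrites the table on every load,
    # instead of the default WRITE_APPEND.
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # Wait for the load to complete.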