Extract BigQuery partitioned table

Tags:

google-bigquery

Is there a way to extract a complete BigQuery partitioned table with one command, so that the data of each partition is extracted into a separate folder named in the format part_col=date_yyyy-mm-dd?

Since a BigQuery partitioned table can read files from Hive-style partitioned directories, is there a way to extract the data in a similar layout? I can extract each partition separately, but that is very cumbersome when I am extracting a lot of partitions.

862

asked Jul 02 '19 14:07

Trishit Ghosh

1 Answer

You could do this programmatically. For instance, you can export a single partition by using a partition decorator such as table$20190801. Then, in the bq extract command, you can use URI patterns (see the example of the workers pattern) for the GCS objects.

Since all objects will be within the same bucket, the folders are just a hierarchical illusion, so you can specify URI patterns on the folders as well, but not on the bucket.

So you would write a script that loops over the DATE value (in the YYYYMMDD form the decorator expects), with something like:

bq extract \
--destination_format [CSV, NEWLINE_DELIMITED_JSON, AVRO] \
--compression [GZIP, AVRO supports DEFLATE and SNAPPY] \
--field_delimiter [DELIMITER] \
--print_header=[true, false] \
'[PROJECT_ID]:[DATASET].[TABLE]$[DATE]' \
'gs://[BUCKET]/part_col=[DATE]/[FILENAME]-*.[csv, json, avro]'
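The loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the CSV format, and the data-*.csv file name are hypothetical choices, the table is assumed to be partitioned by day, and date_range relies on GNU date (the -d option), which is not available on every system. By default the functions only print the commands; set DRY_RUN=0 to actually run bq.

```shell
#!/usr/bin/env bash
# Sketch: one "bq extract" per daily partition, each written to a
# Hive-style folder part_col=YYYY-MM-DD. All names below are hypothetical.

# Print every day from $1 to $2 inclusive (YYYY-MM-DD). Assumes GNU date.
date_range() {
  local day="$1" end="$2"
  while [[ "$day" < "$end" || "$day" == "$end" ]]; do
    echo "$day"
    day="$(date -u -d "$day + 1 day" +%F)" || return 1
  done
}

# Extract one partition. $1 = project:dataset.table, $2 = bucket, $3 = day.
extract_partition() {
  local table_ref="$1" bucket="$2" day="$3"
  local decorator="${day//-/}"   # 2019-08-01 -> 20190801
  local cmd=(bq extract --destination_format CSV
             "${table_ref}\$${decorator}"
             "gs://${bucket}/part_col=${day}/data-*.csv")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "${cmd[*]}"             # dry run: only show the command
  else
    "${cmd[@]}"                  # real run: call bq
  fi
}

# Extract several partitions. $1 = table, $2 = bucket, remaining args = days.
extract_partitions() {
  local table_ref="$1" bucket="$2" day
  shift 2
  for day in "$@"; do
    extract_partition "$table_ref" "$bucket" "$day"
  done
}
```

For example, extract_partitions 'my-project:my_dataset.my_table' my-bucket $(date_range 2019-08-01 2019-08-07) would print seven bq extract commands, one per day, which you can inspect before re-running with DRY_RUN=0.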

You can't do it automatically with a single bq command. For that, it would be better to raise a feature request, as suggested by Felipe.

171

answered Sep 26 '22 13:09

Héctor Neri
