Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Bigquery: Partitioning specification needed for copying date partitioned table

Note: this is nearly a duplicate of this question with the distinction that in this case, the source table is date partitioned and the destination table does not yet exist. Also, the accepted solution to that question didn't work in this case.

I'm trying to copy a single day's worth of data from one date partitioned table into a new date partitoined table that I have not yet created. My hope is that BigQuery would simply create the date-partitioned destination table for me like it usually does for the non-date-partitioned case.

Using BigQuery CLI, here's my command:

bq cp mydataset.sourcetable\$20161231 mydataset.desttable\$20161231

Here's the output of that command:

BigQuery error in cp operation: Error processing job 'myproject:bqjob_bqjobid': Partitioning specification must be provided in order to create partitioned table

I've tried doing something similar using the python SDK: running a select command on a date partitioned table (which selects data from only one date partition) and saving the results into a new destination table (which I hope would also be date partitioned). The job fails with the same error:

{u'message': u'Partitioning specification must be provided in order to create partitioned table', u'reason': u'invalid'}

Clearly I need to add a partitioning specification, but I couldn't find any documentation on how to do so.

like image 863
conradlee Avatar asked Jun 16 '17 10:06

conradlee


People also ask

How do I copy a table in BigQuery?

In the BigQuery UI, select the table you wish to copy, then push the Copy Table button. Enter the desired new table name. BigQuery documentation lists additional methods for copying a table (via API, with Python, PHP, etc).

How many partitions can you have in a BigQuery table?

BigQuery has a limit of 4,000 partitions per table.

How would you query specific partitions in a BigQuery table?

If you want to query data based on a time zone other than UTC, choose one of the following options: Adjust for time zone differences in your SQL queries. Use partition decorators to load data into specific ingestion-time partitions, based on a different time zone than UTC.


1 Answers

You need to create the partitioned destination table first (as per the docs):

If you want to copy a partitioned table into another partitioned table, the partition specifications for the source and destination tables must match.

So, just create the destination partitioned table before you start copying. If you can't be bothered specifying the schema, you can create the destination partitioned table like so:

bq mk --time_partitioning_type=DAY mydataset.temps

Then, use a query instead of a copy to write to the destination table. The schema will be copied with it:

bq query --allow_large_results --replace --destination_table 'mydataset.temps$20160101''SELECT * from `source`'
like image 144
Graham Polley Avatar answered Sep 18 '22 07:09

Graham Polley