
How can I load data into BigQuery without ProtoBuf format error?

When loading data into BigQuery, I get the following error (copied from Job History in the BigQuery web console).

Errors:
query: Failed to load FileDescriptorProto for '_GEN_DREMEL_ONESTORE_METADATA_SCHEMA_': (error code: invalidQuery)
 Field numbers 19000 through 19999 are reserved for the protocol buffer library implementation.
 Field numbers 19000 through 19999 are reserved for the protocol buffer library implementation. 
 [... repeated a total of exactly 1000 times...]
 Field numbers 19000 through 19999 are reserved for the protocol buffer library implementation. 

 (error code: invalidQuery)

The data is a Datastore Managed Backup. (The folks from that team sent me to BigQuery for help.)

The error occurs with one of six randomly selected Kinds; the others load successfully. In addition, loading another Kind gives the error "too many fields: 10693 (error code: invalid)".

The failed Kind and the successful ones are similar in size: about 15 gigabytes of data each.

What can we do to load this data?

asked Oct 19 '22 00:10 by Joshua Fox


1 Answer

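This was caused by a BigQuery limit: a table can have at most 10,000 columns. The "too many fields: 10693" error reported for the other Kind reflects the same limit. A Kind whose backup would produce more columns than that cannot be loaded, so the utility for loading a Datastore backup simply does not work in this case.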
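To see ahead of time whether a Kind is likely to hit this limit, one option is to count the distinct property names across a sample of its entities. The following is a minimal sketch, not part of any BigQuery or Datastore tooling: the helper names (`flatten_names`, `count_columns`, `fits_in_one_table`) are hypothetical, entities are assumed to be plain Python dicts, and the dotted-path flattening of nested properties is an assumption that may not match exactly how BigQuery derives columns from a Datastore backup.

```python
from typing import Any, Dict, Iterable, Set

# Per-table column limit cited in the answer above.
MAX_COLUMNS = 10_000


def flatten_names(entity: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Return the dotted paths of all leaf properties in one entity.

    Nested dicts are treated as embedded entities and flattened
    (an assumption about how columns are derived).
    """
    names: Set[str] = set()
    for key, value in entity.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            names |= flatten_names(value, path)
        else:
            names.add(path)
    return names


def count_columns(entities: Iterable[Dict[str, Any]]) -> int:
    """Count distinct leaf property paths across all given entities."""
    seen: Set[str] = set()
    for entity in entities:
        seen |= flatten_names(entity)
    return len(seen)


def fits_in_one_table(entities: Iterable[Dict[str, Any]],
                      limit: int = MAX_COLUMNS) -> bool:
    """True if the distinct property count does not exceed the limit."""
    return count_columns(entities) <= limit
```

A Kind whose entities use dynamically generated property names (for example, one property per user ID or per date) is the usual way to end up with thousands of distinct names, and this kind of count would flag it before a load is attempted.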

answered Dec 30 '22 22:12 by Joshua Fox