I want to generate some reports based on the data in Bigtable. For that I wanted to create a query which gets the latest data from Bigtable and passes it over to the Data Studio report. The problem is that when I create a Bigtable connection in BigQuery I cannot execute any query, even on an empty table. I create the table in BigQuery in the following way:
bq mk \
--external_table_definition=gs://somebucket/big-table-definition.json \
datareportingdataset.datareportingtable
and the command executes successfully.
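For reference, you can check that the external table was registered with bq show (this only confirms the table exists; it does not validate that the definition can actually be read):
bq show --format=prettyjson datareportingdataset.datareportingtable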
My big-table-definition.json looks like the following:
{
    "sourceFormat": "BIGTABLE",
    "sourceUris": [
        "https://googleapis.com/bigtable/projects/playground-2/instances/data-reporting/tables/data-reporting-table"
    ],
    "bigtableOptions": {
        "readRowkeyAsString": "true",
        "columnFamilies": [
            {
                "familyId": "cf1",
                "onlyReadLatest": "true",
                "columns": [
                    {
                        "qualifierString": "temp",
                        "type": "STRING"
                    },
                    {
                        //the rest of the columns
                    }
                ]
            }
        ]
    }
}
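The //the rest of the columns placeholder stands in for further entries of the same shape; a hypothetical additional entry (the "humidity" qualifier is made up for illustration) would be:
{
    "qualifierString": "humidity",
    "type": "STRING"
}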
The error when running a simple select * query looks like the following:
Error while reading table: datareportingdataset.datareportingtable, error message: Error detected while parsing row starting at position: 2. Error: Data between close double quote (") and field separator.
At first I suspected the data in Bigtable, but the error still occurred after I deleted everything from the table. I have found out that it must be something with the JSON file itself, because when I move "sourceFormat" down a few lines the reported error position changes. What am I doing wrong here?
I just reproduced your case and found the exact same error. It looks to me like when you run the bq mk command it isn't extracting any data at all.
As a workaround, I would suggest running a Dataflow job to export your data to Cloud Storage as .avro files and then importing the data into a dataset in BigQuery.
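A minimal sketch of the import step, assuming the Dataflow job has written Avro files under a hypothetical gs://somebucket/export/ prefix and that you want them in a new table:
bq load --source_format=AVRO \
    datareportingdataset.datareportingtable_import \
    "gs://somebucket/export/*.avro"
Since Avro files are self-describing, bq load reads the schema from the files themselves and no separate schema file is needed.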
I think I found the problem, as I was able to reproduce it. The error message is confusing, but as documented here:
You must create the JSON schema file manually, and it must be on your local machine. Referencing a JSON schema file stored in Cloud Storage or in Google Drive is not supported.
I did some tests with the Bigtable quickstart and it worked well for me:
bq query "SELECT * FROM DATASET.datareportingtable"
Waiting on JOB_ID ... (3s) Current status: DONE
+--------+-----------------+---------------------------+-----------------------+
| rowkey | cf1_column_name | cf1_column_cell_timestamp | cf1_column_cell_value |
+--------+-----------------+---------------------------+-----------------------+
| r1 | c1 | 2019-10-15 18:15:04 | test-value |
+--------+-----------------+---------------------------+-----------------------+
The only thing I did differently was using a local path, as in:
--external_table_definition=big-table-definition.json
Changing this back to:
--external_table_definition=gs://$BUCKET/big-table-definition.json
And I got the same error:
bq query "SELECT * FROM DATASET.datareportingtable2"
Waiting on JOB_ID ... (0s) Current status: DONE
BigQuery error in query operation: Error processing job 'PROJECT:JOB_ID': Error while reading table: DATASET.datareportingtable2, error message: Error detected while parsing row starting at
position: 2. Error: Data between close double quote (") and field separator.
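So if your definition file currently lives in Cloud Storage, a sketch of the fix, assuming the bucket and file name from your question, is to copy it to the local machine and recreate the table from the local path:
bq rm -t datareportingdataset.datareportingtable
gsutil cp gs://somebucket/big-table-definition.json .
bq mk \
--external_table_definition=./big-table-definition.json \
datareportingdataset.datareportingtable
Running bq rm -t first removes the broken table so that bq mk does not fail with an "already exists" error.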