I have uploaded a CSV file with 300K rows from GCS to BigQuery, and received the following error:
Where can I find the error stream?
I've changed the create table configuration to allow 4000 errors and it worked, so it must be a problem with the 3894 rows in the message, but this error message does not tell me much about which rows or why.
Thanks
BigQuery hasn't documented it yet, but you can handle any type of exception in BigQuery by creating an exception handling clause, as described in the following example: BEGIN SELECT 1/0; EXCEPTION WHEN ERROR THEN SELECT @@error.message, @@error.
Listing datasets in a project. In the navigation menu, click SQL workspace. In the Explorer panel, expand a project name to see the datasets in that project, or use the search box to search by dataset name.
You can query the INFORMATION_SCHEMA.JOBS_BY_* view to retrieve real-time metadata about BigQuery jobs. This view contains currently running jobs, as well as the last 180 days of history of completed jobs. Note: Valid states include PENDING, RUNNING, and DONE.
I finally managed to see the error stream by running the following command in the terminal:
bq --format=prettyjson show -j <JobID>
It returns a JSON with more details. In my case it was:
"message": "Error while reading data, error message: Could not parse '16.66666666666667' as int for field Course_Percentage (position 46) starting at location 1717164"
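Once you have a message like this, one way to find the offending row is to look it up in a local copy of the CSV. The sketch below is only an illustration: it assumes that "location" is a byte offset into the uncompressed source file, and the helper names parse_location and read_line_at are made up for this example.

```python
import re
from typing import Optional

# Matches the "starting at location N" suffix seen in the message above.
# Other BigQuery error messages may be worded differently.
_LOCATION_RE = re.compile(r"starting at location (\d+)")


def parse_location(message: str) -> Optional[int]:
    """Return the numeric location from an error message, or None if absent."""
    match = _LOCATION_RE.search(message)
    return int(match.group(1)) if match else None


def read_line_at(path: str, offset: int) -> str:
    """Read one line from a local file, starting at the given byte offset.

    Assumption: the offset reported by BigQuery counts bytes from the start
    of the uncompressed CSV file.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.readline()
    return raw.decode("utf-8", errors="replace").rstrip("\r\n")
```

For the message above you would call read_line_at("local_copy.csv", 1717164), where local_copy.csv is a placeholder for a download of the file from GCS, and then check field 46 of the returned row.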
You should be able to click on Job History in the BigQuery UI, then click the failed load job. I tried loading an invalid CSV file just now, and the errors that I see are:
Errors:
Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details. (error code: invalid)
Error while reading data, error message: CSV table references column position 1, but line starting at position:0 contains only 1 columns. (error code: invalid)
The first one is just a generic message indicating the failure, but the second error (from the "error stream") is the one that provides more context for the failure, namely: CSV table references column position 1, but line starting at position:0 contains only 1 columns.
Edit: given a job ID, you can also use the BigQuery CLI to see complete information about the failure. You would use:
bq --format=prettyjson show -j <job ID>
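If the JSON is long, a small script can pull out just the error stream. This is a minimal sketch, assuming the output follows the BigQuery Jobs resource layout, where the individual errors sit under status.errors and each entry has a message field. The function name error_stream is made up for this example.

```python
import json
from typing import List


def error_stream(job_json: str) -> List[str]:
    """Return the messages listed under status.errors in a job's JSON.

    Assumption: job_json is the text printed by the bq command above and
    uses the Jobs resource field names (status, errors, message).
    """
    job = json.loads(job_json)
    status = job.get("status", {})
    # status.errors is absent when the job finished without errors.
    return [err.get("message", "") for err in status.get("errors", [])]
```

Save the command's output to a file, read it into a string, and pass it to error_stream to get one message per failed record that BigQuery reported.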