Using multiple '*' patterns when loading into BigQuery won't work

We're trying to use a glob pattern when loading into BigQuery, for example:

gs://<bucket_name>/Network*Impressions_12345_20150201*

Our bucket contains both "..NetworkImpressions_.." and "..NetworkBackfillImpressions_.." files, so we use the first '*' to match both types. But BigQuery fails with:

"Not found: URI gs://backup-gdfp-7415/Network*Impressions_232503_20150101_20*"

The files definitely exist. If we remove the first '*' it works fine, and it also works when we explicitly specify both types.

Here's a job id for a failed load job where we are trying to use the pattern: job_LXNGEAeWsaU9HyFgcCCJMHu8YtY

Shouldn't this be possible with BigQuery?

Graham Polley asked Oct 19 '22 16:10


1 Answer

From the documentation for the sourceUris parameter of the load job configuration:

[Required] The fully-qualified URIs that point to your data in Google Cloud Storage. Wildcard names are only supported when they appear at the end of the URI.

Danny Kitt answered Oct 22 '22 19:10
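
A possible workaround, sketched here as an illustration rather than taken from the answer: since the question notes that explicitly specifying both file types works, the single two-wildcard pattern can be expanded into one URI per file type, each with its only '*' at the end. The helper name `trailing_wildcard_uris` and the bucket name `my-bucket` are hypothetical, and the prefixes assume the file names begin as shown in the question.

```python
def trailing_wildcard_uris(bucket, prefixes, suffix):
    """Build one gs:// URI per prefix, each ending in a single '*'."""
    uris = []
    for prefix in prefixes:
        uri = f"gs://{bucket}/{prefix}{suffix}*"
        # Guard: the only wildcard allowed is the final character.
        if uri.count("*") != 1:
            raise ValueError(f"wildcard must appear only at the end: {uri}")
        uris.append(uri)
    return uris


source_uris = trailing_wildcard_uris(
    "my-bucket",  # hypothetical bucket name
    ["NetworkImpressions", "NetworkBackfillImpressions"],
    "_12345_20150201",
)
print(source_uris)
```

The resulting list holds two URIs, which could then be supplied together as the sourceUris value of one load job instead of a single pattern containing two wildcards.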
