
Gcloud topic escaping in Apache Beam

I'm trying to run a Dataflow job with the gcloud command:

gcloud beta dataflow jobs run test --gcs-location gs://bucket/templates/templateName --parameters query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"

But I get an error saying:

ERROR: (gcloud.beta.dataflow.jobs.run) argument --parameters: Bad syntax for dict arg: [b.salary]. Please see gcloud topic escaping if you would like information on escaping list or dictionary flag values.

I read the documentation for gcloud topic escaping but cannot figure out how to apply it here. Can somebody please help me with this?

Thanks.

rish0097 asked Apr 02 '26 05:04



2 Answers

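The --parameters flag takes a dictionary as its value. By default gcloud splits that value on commas to separate the dictionary's elements, so the commas inside the query break it into pieces such as b.salary that are not valid key=value pairs. As described in gcloud topic escaping, you can specify a different delimiter between the dictionary's elements, even though we only have one element here.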
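
To see why the error names b.salary specifically, here is a rough shell emulation of the comma splitting. check_dict_arg is a hypothetical helper written for this illustration, not gcloud's actual parser; it assumes only that the value is split on commas and that every piece must contain an = sign.

```shell
# Hypothetical helper, NOT gcloud's real parser: split the value on commas
# and flag every piece that lacks an "=" (i.e. is not a key=value pair).
check_dict_arg() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r piece; do
    case "$piece" in
      *=*) printf 'ok:  %s\n' "$piece" ;;
      *)   printf 'bad: %s\n' "$piece" ;;
    esac
  done
}

check_dict_arg 'query=select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id'
```

Under these assumptions only the middle piece, b.salary, is flagged. The last piece passes merely because the join condition happens to contain an = sign.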

Therefore we can just pick an arbitrary delimiter that does not appear in the value, such as ":", by prefixing the value with ^:^ (notice the change before query=):

gcloud beta dataflow jobs run test --gcs-location gs://bucket/templates/templateName --parameters ^:^query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"
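
It may also help to quote the whole value rather than only the query. A small sketch, assuming a POSIX-like shell: single quotes keep the shell from interpreting ^ (some shells treat it specially, for example zsh with EXTENDED_GLOB enabled), and printf shows the arguments that would be handed to gcloud. No gcloud call is made here.

```shell
# Single-quote the entire value so ^, commas and spaces reach the program untouched.
params='^:^query=select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id'

# Print each argument on its own line to inspect what gcloud would receive.
printf '%s\n' --parameters "$params"
```

Once the printed value looks right, the same "$params" can be passed after --parameters in the real command.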


On an actual template (provided by Google):

gcloud beta dataflow jobs run test --gcs-location=gs://dataflow-templates/wordcount/template_file --parameters ^:^query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"

This returns INVALID_ARGUMENT: (bf23ae8a2a6f1efe): The workflow could not be created. Causes: (bf23ae8a2a6f165b): Found unexpected parameters: ['query' (perhaps you meant 'runner')], which shows that we have indeed fixed the issue: Dataflow properly understands that we're passing a query parameter. However, the Google template uses no such parameter and therefore throws an error, which is the expected behavior.

The hBar Tender answered Apr 08 '26 18:04



Thanks a great bunch for this. For anyone still confused, here is a sample with multiple elements, using ~ as the delimiter:

gcloud dataflow jobs run <INPUT> \
--gcs-location=gs://dataflow-templates-us-central1/latest/Jdbc_to_BigQuery \
--region=northamerica-northeast1 \
--network=<INPUT> \
--subnetwork=<INPUT> \
--parameters ^~^driverClassName=com.mysql.jdbc.Driver~driverJars="<INPUT>"~connectionProperties="<INPUT>"~outputTable="<INPUT>"~password="<INPUT>"~username="<INPUT>"~bigQueryLoadingTemporaryDirectory="<INPUT>"~connectionURL="<INPUT>"~query="select name, age, location from testtable"
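
For long parameter lists like this one, the value can be assembled from separate pairs instead of typed as one string. The following is a minimal sketch: join_params is a hypothetical helper (not part of gcloud), and the outputTable value is a made-up placeholder. It joins the pairs with the chosen delimiter, adds the ^DELIM^ prefix, and refuses a delimiter that already occurs inside a pair.

```shell
# Hypothetical helper (not part of gcloud): join key=value pairs with DELIM
# and prefix the result with ^DELIM^. Fails if DELIM occurs inside any pair.
join_params() {
  local delim="$1"; shift
  local out="" pair
  for pair in "$@"; do
    case "$pair" in
      *"$delim"*)
        echo "delimiter '$delim' occurs in: $pair" >&2
        return 1
        ;;
    esac
    # Append the delimiter only between pairs, not before the first one.
    out="${out:+$out$delim}$pair"
  done
  printf '^%s^%s' "$delim" "$out"
}

# The outputTable value below is a placeholder invented for this example.
params="$(join_params '~' \
  'driverClassName=com.mysql.jdbc.Driver' \
  'outputTable=my-project:my_dataset.my_table' \
  'query=select name, age, location from testtable')"
echo "$params"
```

The resulting "$params" would then go after --parameters. The delimiter check matters here because ":" would not be a safe choice for this template: JDBC connection URLs and project-qualified table names typically contain colons.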
verygreatmove answered Apr 08 '26 19:04
