I'm trying to use Google Workflows to do some automated BigQuery scheduling tasks. The requirement is to run a query on multiple datasets as the following:
- execute_query_job:
call: execute_query_job
args:
query_text: >-
SELECT
* EXCEPT(row_number)
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY uuid) row_number
FROM
`project.${database_id}.table`)
WHERE
row_number = 1
however, this doesn't work since the string is interpreted as is and no interpolation happened.
The ${} syntax will not span over multiple lines and ansible syntax with {{ var }} also did not work.
Try to change the query to a single line in a similar fashion as:
- execute_query_job:
call: execute_query_job
args:
query_text: ${"SELECT * EXCEPT(row_number) FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY uuid) row_number FROM `project."+database_id+".table`) WHERE row_number = 1"}
Notice that as per the Workflow's docs:
Variables can be assigned to a particular value or to the result of an expression.
If that doesn't work notice that making a POST request to the BigQuery's API jobs.insert method will allow you to specify a JobConfiguration, where you could change the defaultDataset field and change this values for each different dataset at each iteration. The following sample shows how to make iterations based on the values of an array in Workflows.
You may want take a look at the official document. you could concat the variable by multiple lines.
- assign_vars:
assign:
- string: "say"
- string: ${string+" "+"hello"}
- string: ${string+" "+"to the world"}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With