To separate BigQuery queries from the actual code, I want to store the SQL in a separate file and read it from the Python code. I have tried adding the file to the same bucket as the DAGs and also to a subfolder, but it seems like I can't read the file when Airflow is running the Python script that uses the SQL files.
What I want is this:
gs://my-bucket/dags -> store dags
gs://my-bucket/dags/sql -> store sql files
The SQL files might be files that I need to read first, to inject things that are not supported by the Jinja templating.
Can I do the above?
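For reference, this is roughly the kind of thing I want to do. The helper, file name, and parameters below are only an example, not working code:

import os

SQL_DIR = os.path.join(os.path.dirname(__file__), "sql")

def load_query(filename, **params):
    """Read a SQL file from the sql/ subfolder and inject values that
    Jinja templating can't handle (e.g. a dynamically built table name)."""
    with open(os.path.join(SQL_DIR, filename)) as f:
        query = f.read()
    return query.format(**params)

# query = load_query("daily_load.sql", dataset="my_dataset", table="my_table")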
Cloud Composer mounts the GCS bucket gs://my-bucket into the environment at /home/airflow/gcs/ using a FUSE driver. This means that the gs://my-bucket/dags folder is available to the scheduler, web server, and workers at /home/airflow/gcs/dags.

Your DAGs should be able to read the SQL files from the /home/airflow/gcs/dags/sql directory.

Note: the /home/airflow/gcs/data directory is available on the workers but not on the web server.
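For example, a DAG along these lines should be able to pick up the query at parse time. The file name and DAG id are placeholders, and BigQueryInsertJobOperator is just one way to run the query (it assumes the Google provider package that ships with Composer):

import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# The DAGs folder is mounted at /home/airflow/gcs/dags, so the sql/
# subfolder can be read like any local directory. File name is illustrative.
SQL_FILE = "/home/airflow/gcs/dags/sql/my_query.sql"

with open(SQL_FILE) as f:
    query = f.read()

with DAG(
    dag_id="read_sql_from_bucket",
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    BigQueryInsertJobOperator(
        task_id="run_query",
        configuration={"query": {"query": query, "useLegacySql": False}},
    )

If you need to inject values that Jinja can't handle, you can format the query string after reading it, before passing it to the operator.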