 

How do I read a file in the airflow cloud composer bucket?

To separate BigQuery queries from the actual code, I want to store the SQL in a separate file and then read it from the Python code. I have tried adding the file in the same bucket as the DAGs, and also in a subfolder, but it seems like I can't read the file when Airflow runs the Python script that uses the SQL files.

What I want is this:

gs://my-bucket/dags -> store dags
gs://my-bucket/dags/sql -> store sql files

The SQL files might be files that I need to read first in order to inject things that are not supported by the Jinja templating.

Can I do the above?

Tomas Jansson asked May 24 '18

People also ask

What is Dag in cloud composer?

A DAG (directed acyclic graph) defines a collection of tasks. Any task you create within the DAG's context manager is automatically added to the DAG object.
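For example, a minimal DAG sketch (using the Airflow 1.x import path that Composer ran at the time; the DAG id and task are hypothetical):

    import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator  # Airflow 1.x import path

    with DAG(
        dag_id="example_dag",                       # hypothetical DAG name
        start_date=datetime.datetime(2018, 1, 1),
        schedule_interval="@daily",
    ) as dag:
        # Any task created inside the context manager is added to this DAG.
        hello = BashOperator(task_id="say_hello", bash_command="echo hello")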

How do you upload a DAG file?

To add or update a DAG, move the Python .py file for the DAG to the /dags folder in the environment's bucket. In the Google Cloud console, go to the Environments page. In the list of environments, find the row with the name of your environment and, in the DAGs folder column, click the DAGs link.
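The upload can also be scripted; here is a sketch using the google-cloud-storage client (the bucket name and file name are hypothetical; the real bucket name is shown on the environment's details page):

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("us-central1-my-env-abc123-bucket")  # hypothetical Composer bucket
    # Copy the local DAG file into the bucket's dags/ folder.
    bucket.blob("dags/my_dag.py").upload_from_filename("my_dag.py")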

What is the use of cloud composer?

Cloud Composer is a fully managed workflow orchestration service built on Apache Airflow that helps you author, schedule, and monitor pipelines spanning hybrid and multi-cloud environments.


1 Answer

Cloud Composer mounts the GCS bucket using a FUSE driver from gs://my-bucket to /home/airflow/gcs/. This means that the gs://my-bucket/dags folder is available in the scheduler, web server, and workers at /home/airflow/gcs/dags.

Your DAGs should be able to read the SQL files from the /home/airflow/gcs/dags/sql directory.
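As a sketch (the file name, placeholder convention, and helper function are hypothetical, not part of the question), a DAG file could read and pre-process such a file like this:

    import os

    # gs://my-bucket/dags is mounted here on the scheduler, web server, and workers.
    SQL_DIR = "/home/airflow/gcs/dags/sql"

    def read_sql(filename, **replacements):
        """Read a SQL file and substitute values that Jinja templating can't handle."""
        with open(os.path.join(SQL_DIR, filename)) as f:
            sql = f.read()
        for key, value in replacements.items():
            sql = sql.replace("{" + key + "}", value)
        return sql

    # Hypothetical usage inside a DAG definition:
    # query = read_sql("daily_report.sql", dataset="analytics")
    # then pass `query` as the sql= argument of a BigQuery operator.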

Note: the /home/airflow/gcs/data directory is available on workers but not the webserver.

Tim Swast answered Sep 25 '22