 

Airflow: New DAG is not found by webserver

In Airflow, how should I handle the error "This DAG isn't available in the webserver DagBag object. It shows up in this list because the scheduler marked it as active in the metadata database"?

I've copied a new DAG to an Airflow server, and have tried:

  • unpausing it and refreshing it (standard operating procedure, given in this previous answer: https://stackoverflow.com/a/42291683/160406)
  • restarting the webserver
  • restarting the scheduler
  • stopping the webserver and scheduler, resetting the database (airflow resetdb), then starting the webserver and scheduler again
  • running airflow backfill (suggested in another question with the same "This DAG isn't available in the webserver DagBag object" error)
  • running airflow trigger_dag

The scheduler log shows the DAG being processed with no errors, and I can interact with it and view its state through the CLI, but it still does not appear in the web UI.

Edit: the webserver and scheduler are running on the same machine with the same airflow.cfg. They're not running in Docker.

They're run by Supervisor, which runs them both as the same user (airflow). The airflow user has read, write and execute permission on all of the dag files.
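One check not covered above is to build a DagBag by hand, the same way the webserver does, and look for import errors; a file the scheduler processes happily can still fail to import in the webserver's environment. A minimal sketch (run as the airflow user, using the DagBag API as it existed in Airflow 1.x):

from airflow.models import DagBag

# Load the configured dags_folder the same way the webserver does
# and surface anything that failed to import.
dagbag = DagBag()
print("DAGs found:", sorted(dagbag.dags.keys()))
print("Import errors:", dagbag.import_errors)

If the new DAG is missing from the first list, the import errors dictionary usually says why.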

asked Apr 28 '17 by Ollie Glass


People also ask

Why is my DAG not appearing in Airflow?

Create a subdirectory called dags in your main project directory and move your DAG there. Then refresh the Airflow UI and you should see it. Note that AIRFLOW_HOME should be set to your main project directory.
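To confirm which directory the running configuration actually resolves, you can print dags_folder from Python; a sketch (the conf object import is from Airflow 1.10+, while older releases expose airflow.configuration.get directly):

from airflow.configuration import conf

# Both the webserver and the scheduler must resolve this same path.
print(conf.get("core", "dags_folder"))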

How do I load new DAGs in Airflow?

Go to the folder that you've designated as your AIRFLOW_HOME and find the DAGs folder located in the subfolder dags/ (if you cannot find it, check the dags_folder setting in $AIRFLOW_HOME/airflow.cfg). Create a Python file named airflow_tutorial.py that will contain your DAG.
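A minimal airflow_tutorial.py along those lines might look like this (a sketch using the Airflow 1.x import path for DummyOperator; the dag_id, start date, and schedule are illustrative):

from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# A DAG object at module level is what the scheduler and webserver pick up.
dag = DAG(
    dag_id="airflow_tutorial",
    start_date=datetime(2017, 4, 1),
    schedule_interval="@daily",
)

# A single no-op task so the DAG has something to render in the UI.
start = DummyOperator(task_id="start", dag=dag)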

Where are DAGs located in Airflow?

The default location for your DAGs is ~/airflow/dags.


2 Answers

This helped me:

pkill -9 -f "airflow scheduler"
pkill -9 -f "airflow webserver"
pkill -9 -f "gunicorn"

Then restart the Airflow scheduler and webserver.

answered Sep 19 '22 by viru


Just had this issue myself. After changing permissions, resetting the metadata database, restarting the webserver, and even making some speculative code changes, the DAG still didn't appear.

However, I noticed that even though we were stopping the webserver, our gunicorn processes were still running. Killing those processes and then starting everything back up did the trick.
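A quick way to look for those leftover workers from Python is to shell out to pgrep; a sketch (assumes pgrep is installed and Python 3.7+ for capture_output):

import subprocess

# pgrep -af matches the full command line and prints pid plus command,
# so any gunicorn workers left behind by the webserver show up here.
result = subprocess.run(["pgrep", "-af", "gunicorn"],
                        capture_output=True, text=True)
print(result.stdout or "no gunicorn processes found")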

answered Sep 20 '22 by justcompile