
Airflow: "This DAG isn't available in the webserver DagBag object"

When I put a new DAG Python script in the dags folder, a new entry appears in the DAG UI, but the DAG is not enabled automatically. On top of that, it does not seem to load properly either. To get the DAG scheduled, I have to click the Refresh button on the right side of the list a few times and then toggle the on/off button on the left side of the list. These are manual steps that I have to perform even though the DAG script is already in the dags folder.

Can anyone help me with this? Did I miss something, or is this the correct behavior in Airflow?

By the way, as mentioned in the post title, before I go through this manual process the DAG title is tagged with an indicator showing this message: "This DAG isn't available in the webserver DagBag object. It shows up in this list because the scheduler marked it as active in the metdata database".

santi asked Jan 10 '17 03:01



1 Answer

This is neither your fault nor correct or expected behavior. It is a current 'bug' in Airflow: the web server caches the DagBag in a way that prevents you from using it as expected.

"Attempt removing DagBag caching for the web server" remains on the official TODO as part of the roadmap, which suggests that this bug may not yet be fully resolved. Here are some suggestions on how to proceed:

Only use builders in Airflow v1.9+

Prior to Airflow v1.9, this occurs when a DAG is instantiated by a function that is imported into the file where the instantiation happens, that is, when a builder or factory pattern is used. Reports of this issue on GitHub and JIRA led to a fix released in Airflow v1.9.

If you are using an older version of Airflow, don't use builder functions.

Use airflow backfill to reload the cache

As Dmitri suggests, running airflow backfill '<dag_id>' -s '<date>' -e '<date>' with the same start and end date can sometimes help. Afterwards you may run into the (non-)issue that Priyank points out, but that is expected behavior (the DAG's state is paused or not) depending on the configuration of your installation.
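
If you want to script the backfill workaround instead of typing it each time, a minimal sketch might look like the following. It only assembles the same command quoted above (same date for -s and -e) and optionally runs it. The helper names backfill_command and run_backfill, the DAG id "example_dag", and the date are hypothetical placeholders. It also assumes the Airflow 1.x airflow executable is on your PATH.

```python
import subprocess


def backfill_command(dag_id, date):
    """Build the `airflow backfill` argument list for a single-day run.

    The start (-s) and end (-e) dates are deliberately identical,
    as suggested in the answer above.
    """
    return ["airflow", "backfill", dag_id, "-s", date, "-e", date]


def run_backfill(dag_id, date):
    """Run the backfill; assumes the Airflow 1.x `airflow` CLI is on PATH.

    check=True raises CalledProcessError if the command exits non-zero.
    """
    return subprocess.run(backfill_command(dag_id, date), check=True)


if __name__ == "__main__":
    # "example_dag" and the date are placeholders, not values from the question.
    print(" ".join(backfill_command("example_dag", "2017-01-10")))
```

Calling run_backfill("example_dag", "2017-01-10") would execute the command for real. Whether the DAG then shows up paused or unpaused is likely governed by the dags_are_paused_at_creation option in the [core] section of airflow.cfg, so check that setting if the toggle still starts in the off position.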

Guille answered Oct 17 '22 08:10
