I have an airflow service that is currently running as separate docker containers for the webserver and scheduler, both backed by a postgres database. I have the dags synced between the two instances, and the dags load correctly when the services start. However, if I add a new dag to the dag folder (on both containers) while the service is running, the dag gets loaded into the dagbag but shows up in the web GUI with missing metadata. I can run "airflow initdb" after each update, but that doesn't feel right. Is there a better way for the scheduler and webserver to sync up with the database?
Dag updates should be picked up automatically. If they don't get picked up, it's often because the change you made "broke" the dag.
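For reference, the intervals that control how quickly changes are picked up live in airflow.cfg. The values below are the defaults I'd expect in Airflow 1.x; treat them as a sketch and check them against your version:

```ini
[scheduler]
# How often (seconds) the scheduler scans the dags folder for new files.
dag_dir_list_interval = 300
# Minimum seconds between re-parses of an individual dag file.
min_file_process_interval = 0

[webserver]
# How often (seconds) webserver workers restart and reload the DagBag.
worker_refresh_interval = 30
```

If a new dag takes a few minutes to appear, that's usually just these intervals, not a sync problem.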
To check that new tasks are in fact picked up, on your webserver, run:
airflow list_tasks <dag_name> --tree
If it says the dag was not found, then the file has an error that prevents it from being parsed.
If it runs successfully, it should list all your tasks, and those tasks should appear in the Airflow UI when you refresh it.
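If list_tasks reports the dag missing, the quickest way to see the actual error is to import the file yourself. This is a standalone sketch that mimics what Airflow's DagBag does when it parses a file (the function name and paths are mine, not Airflow's):

```python
import importlib.util
import traceback

def check_dag_file(path):
    """Try to import a dag file the way Airflow parses it.

    Returns None if the file imports cleanly, or the traceback
    string if it raises -- which is exactly the kind of failure
    that makes a dag silently disappear from the DagBag.
    """
    spec = importlib.util.spec_from_file_location("dag_under_test", path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        return None
    except Exception:
        return traceback.format_exc()
```

For example, running check_dag_file("/usr/local/airflow/dags/my_dag.py") inside the webserver container will print the import error that the UI hides.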
If the new/updated tasks are not showing up there, then check the dags folder on your webserver and verify that the code is indeed being updated.
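One way to verify the folders really match is to checksum the dag files each process sees and diff the results. The folder paths below are assumptions; point them at wherever your two containers' dag volumes are mounted:

```shell
# Hypothetical mount points for each container's dag folder; adjust to your layout.
WEB_DAGS=${WEB_DAGS:-./webserver/dags}
SCHED_DAGS=${SCHED_DAGS:-./scheduler/dags}

# Sorted checksums of every dag file in each folder.
( cd "$WEB_DAGS" && md5sum *.py | sort ) > /tmp/webserver.sums
( cd "$SCHED_DAGS" && md5sum *.py | sort ) > /tmp/scheduler.sums

# Any output here means the two containers are seeing different dag code.
diff /tmp/webserver.sums /tmp/scheduler.sums
```

If the containers don't share a volume, you can run the md5sum half inside each container with docker exec and diff the outputs on the host.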