
Debugging Broken DAGs

When the Airflow webserver shows errors like Broken DAG: [<path/to/dag>] <error>, how and where can we find the full stacktrace for these exceptions?

I tried these locations:

/var/log/airflow/webserver -- had no logs in the timeframe of the failure; the other logs were binary, and decoding them with strings gave no useful information.

/var/log/airflow/scheduler -- had some logs, but they were also binary; what I could read looked to be mostly sqlalchemy output, probably from Airflow's metadata database.

/var/log/airflow/worker -- shows the logs for running DAGs (the same ones you see in the Airflow UI).

/var/log/airflow/rotated -- also checked here, but couldn't find the stacktrace I was looking for.

I am using Airflow v1.7.1.3.
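A quick way to get the same traceback locally is to run the DAG file directly with the Python interpreter: if the file fails to import, you see the full stacktrace rather than the one-line Broken DAG banner. A minimal sketch, assuming a hypothetical DAG path:

```python
import runpy
import traceback

# Hypothetical path: substitute the file named in the Broken DAG banner.
dag_path = "/path/to/dag.py"

try:
    # Executing the file reproduces any syntax or import error Airflow hit
    # while parsing it, this time with the complete traceback.
    runpy.run_path(dag_path)
    print("DAG file imported cleanly")
except Exception:
    traceback.print_exc()
```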

arbazkhan002 asked May 12 '17 18:05


People also ask

How do you check Airflow DAG logs?

You can view task logs in the Airflow web interface. In Cloud Composer, streaming logs are a superset of the Airflow logs; to access them, go to the Logs tab of the Environment details page in the Google Cloud console, or use Cloud Logging or Cloud Monitoring. Logging and Monitoring quotas apply.

How often does Airflow check for new DAGs?

You can set your scheduler service to restart every few minutes; it picks up new DAGs after restarting. Run airflow scheduler -r 300 so that the scheduler exits every 300 seconds; if your service always restarts the scheduler, every new DAG should be loaded in under 5 minutes.


1 Answer

Usually I use the command airflow list_dags, which prints the full stacktrace for any Python error found while loading the DAGs.

That works with almost any Airflow command, because Airflow parses the DAGs folder each time you run an Airflow CLI command.
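Why this works can be shown with a plain-Python stand-in for what Airflow does at CLI startup: it tries to import every .py file in the DAGs folder and records the full traceback of any failure. The helper name and folder layout below are hypothetical; this is a simplified sketch, not Airflow's actual implementation:

```python
import os
import runpy
import traceback

def collect_import_errors(dag_folder):
    """Record a full traceback for every .py file in dag_folder that
    fails to import -- a simplified stand-in for Airflow's DAG parsing."""
    errors = {}
    for name in sorted(os.listdir(dag_folder)):
        if not name.endswith(".py"):
            continue
        path = os.path.join(dag_folder, name)
        try:
            # A file that raises here is exactly a "Broken DAG".
            runpy.run_path(path)
        except Exception:
            errors[path] = traceback.format_exc()
    return errors
```

Any CLI entry point that builds this kind of mapping can print the recorded stacktraces, which is why list_dags (or almost any other command) surfaces them.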

Babcool answered Oct 08 '22 07:10