When the Airflow webserver shows errors like Broken DAG: [<path/to/dag>] <error>, how and where can we find the full stacktrace for these exceptions?
I tried these locations:
/var/log/airflow/webserver
-- had no logs from the timeframe of execution; the other logs were binary, and decoding them with strings gave no useful information.
/var/log/airflow/scheduler
-- had some logs, but they were in binary form; the readable parts looked to be mostly sqlalchemy logs, probably from Airflow's database.
/var/log/airflow/worker
-- shows the logs for running DAGs (the same ones you see on the Airflow web page),
and also under /var/log/airflow/rotated
-- couldn't find the stacktrace I was looking for.
I am using Airflow v1.7.1.3.
If you are running Airflow on Google Cloud Composer, you can also view the logs in the Airflow web interface. Composer's streaming logs are a superset of the logs in Airflow: to access them, go to the Logs tab of the Environment details page in the Google Cloud console, or use Cloud Logging or Cloud Monitoring. Logging and Monitoring quotas apply.
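For example, assuming a Cloud Composer environment, a Cloud Logging query along these lines can pull the scheduler's streaming logs, where DAG-parsing tracebacks typically land (the log ID and limit here are illustrative):

    # Read recent Airflow scheduler logs from Cloud Logging
    # (filter values are illustrative; adjust to your environment).
    gcloud logging read \
        'resource.type="cloud_composer_environment" AND log_id("airflow-scheduler")' \
        --limit 50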
You can set your scheduler service to restart every few minutes; it will pick up new DAGs after getting restarted. Just use airflow scheduler -r 300, which makes the scheduler exit every 300 seconds, so if you set up your service to always restart the scheduler, every new DAG should get loaded within 5 minutes.
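If the scheduler runs under a process supervisor (systemd, supervisord), configure it to always restart. As a minimal stand-in, assuming you launch the scheduler yourself from a shell script, a wrapper loop achieves the same effect:

    #!/bin/sh
    # Restart the scheduler forever so newly added DAGs get picked up.
    # -r 300 makes each run exit after 300 seconds (an Airflow 1.x flag).
    while true; do
        airflow scheduler -r 300
    done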
Usually I use the command airflow list_dags, which prints the full stacktrace for any Python error found in the DAGs. This will work with almost any Airflow command, since Airflow parses the DAGs folder each time you run a CLI command.
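For example (the DAG file path below is a placeholder for one of your own DAGs):

    # Parse the whole DAGs folder; import errors are printed with
    # their full Python stacktraces:
    airflow list_dags

    # Alternatively, run a single DAG file through the interpreter to
    # get the raw traceback for import-time errors:
    python /path/to/dag.py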