 

Airflow giving log file does not exist error while running on Docker

The scheduler and the webserver run in separate containers, and when I run a DAG and check its logs in the webserver UI, I see this error:

*** Log file does not exist: /usr/local/airflow/logs/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log
*** Fetching from: http://465e0f4a4332:8793/log/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='465e0f4a4332', port=8793): Max retries exceeded with url: /log/indexing/index_articles/2019-12-31T00:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0a143700d0>: Failed to establish a new connection: [Errno 111] Connection refused'))

I set the Airflow variables as mentioned in this other similar question; the only values I'm changing in the cfg files are these:

AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres:5432/airflow
AIRFLOW__CORE__LOAD_EXAMPLES=False
AIRFLOW__CORE__BASE_URL = http://{hostname}:8080
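
For context, a minimal sketch of how such variables can be injected per service in docker-compose; the image name below is an assumption based on the /usr/local/airflow paths, not something stated in the question:

services:
  webserver:
    image: puckel/docker-airflow:1.10.9  # assumed image, implied by the /usr/local/airflow layout
    environment:
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres:5432/airflow
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      # note: base_url lives in the [webserver] section of airflow.cfg, so the
      # env-var form is AIRFLOW__WEBSERVER__BASE_URL, not AIRFLOW__CORE__BASE_URL
      - AIRFLOW__WEBSERVER__BASE_URL=http://localhost:8080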

I manually checked, and the log files are being generated properly, so I'm assuming the only problem is that the log URL is not reachable from the webserver container. I'm not sure where I'm messing this up; I'm running and testing all of this locally.

asked Jan 04 '20 by isht3



1 Answer

The problem is that the Docker containers do not share a filesystem. This is indicated by the first line of the error output: the webserver looks for the log file on its own local disk and does not find it.

Airflow then falls back to fetching the log file over HTTP, as indicated by the second line. Other answers try to fix this by overriding the hostname_callable setting; however, that will not work unless the worker is actually exposing the log files over HTTP.
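
For reference, the override those answers describe is an environment variable along these lines (Airflow 1.10-era syntax, where the value is a module:function path; 2.x uses socket.getfqdn instead):

# Sketch only: makes the worker register a resolvable name
# instead of its container ID (e.g. 465e0f4a4332).
AIRFLOW__CORE__HOSTNAME_CALLABLE=socket:getfqdn

Even with a resolvable hostname, the webserver can only fetch the file if the worker container is running Airflow's log server (the serve_logs process, port 8793) and that port is reachable from the webserver, which is why the shared-volume fix below is more robust.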

The solution is to fix the first problem by mounting a shared volume.

In your docker-compose.yml file, add a new volume called logs-volume.

volumes:
  logs-volume:

Then, also in the docker-compose.yml file, mount this volume at the required log directory, in your case /usr/local/airflow/logs/, for each service:

services: 
  worker:
    volumes:
      - logs-volume:/usr/local/airflow/logs
  webserver:
    volumes:
      - logs-volume:/usr/local/airflow/logs
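
Put together, a trimmed docker-compose.yml would look roughly like this (the image name and port mapping are illustrative, not part of the original answer):

services:
  webserver:
    image: puckel/docker-airflow:1.10.9   # illustrative image; use your own
    ports:
      - "8080:8080"
    volumes:
      - logs-volume:/usr/local/airflow/logs   # same named volume, same mount path...

  worker:
    image: puckel/docker-airflow:1.10.9   # illustrative image; use your own
    volumes:
      - logs-volume:/usr/local/airflow/logs   # ...so both containers see the same files

volumes:
  logs-volume:

With both containers reading and writing the same named volume, the webserver finds the log file on its own filesystem and never needs the HTTP fallback.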
answered Sep 19 '22 by James James