Google Cloud Composer (Apache Airflow) cannot access log files

I'm running a DAG in Google Cloud Composer (hosted Airflow) which runs fine in Airflow locally. All it does is print "Hello World". However, when I run it through Cloud Composer I receive the error:

*** Log file does not exist: /home/airflow/gcs/logs/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Fetching from: http://airflow-worker-d775d7cdd-tmzj9:8793/log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-d775d7cdd-tmzj9', port=8793): Max retries exceeded with url: /log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8825920160>: Failed to establish a new connection: [Errno -2] Name or service not known',))

I've also tried making the DAG insert data into a database, and that succeeds about 50% of the time. However, the run always returns this error message, with no other print statements or logs. Any help on why this might be happening would be much appreciated.

Matt asked Mar 02 '23 13:03


1 Answer

We faced the same issue, raised a support ticket with GCP, and got the following reply:

  1. The message is related to the latency of syncing logs from the Airflow workers to the web server. The sync takes at least a few minutes, depending on the number of objects and their size. The total log size does not seem large, but it is enough to noticeably slow down synchronization, so we recommend cleaning up or archiving the logs.

  2. Basically, we recommend relying on Stackdriver logs instead, because of the latency inherent in the design of this sync.
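
To act on the second point, you can query the worker logs directly in Stackdriver (now called Cloud Logging) instead of waiting for the sync. Below is a minimal sketch that only builds the filter string. The project and environment names are hypothetical placeholders, and the log name and label keys (`airflow-worker`, `labels.workflow`, `labels."task-id"`) are my assumptions about how Composer tags worker logs, so check them against a real log entry in your project before relying on them.

```python
def build_worker_log_filter(project_id, environment_name, dag_id, task_id):
    """Build a Cloud Logging filter for one task's Airflow worker logs.

    Assumption: Composer writes worker logs to the "airflow-worker" log and
    labels each entry with the DAG id ("workflow") and the task id ("task-id").
    Verify these names against an actual entry in your own project.
    """
    clauses = [
        'resource.type="cloud_composer_environment"',
        f'resource.labels.environment_name="{environment_name}"',
        f'logName="projects/{project_id}/logs/airflow-worker"',
        f'labels.workflow="{dag_id}"',
        # The label key contains a hyphen, so it is quoted to be safe.
        f'labels."task-id"="{task_id}"',
    ]
    return " AND ".join(clauses)


if __name__ == "__main__":
    # "my-project" and "my-environment" are placeholders, not real names.
    print(
        build_worker_log_filter(
            project_id="my-project",
            environment_name="my-environment",
            dag_id="matts_custom_dag",
            task_id="main_test",
        )
    )
```

The resulting string can be pasted into the Logs Explorer query box or passed as the filter argument to `gcloud logging read`.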

I hope this will help you solve the problem.

SANN3 answered Apr 01 '23 14:04