I've been looking through various answers on this topic but haven't been able to get a working solution.
I have Airflow set up to log to S3, but the UI seems to only use the file-based task handler instead of the S3 one specified.
I have the S3 connection set up as follows:
Conn_id = my_conn_S3
Conn_type = S3
Extra = {"region_name": "us-east-1"}
(the ECS instance uses a role that has full S3 permissions)
I have also created a log_config file with the following settings:
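For reference, the same connection can also be created programmatically; this is just a minimal sketch using the values above (not how I actually set it up):

from airflow import settings
from airflow.models import Connection

# Mirrors the connection described above; Extra is stored as a JSON string.
s3_conn = Connection(
    conn_id='my_conn_S3',
    conn_type='s3',
    extra='{"region_name": "us-east-1"}',
)

session = settings.Session()
# Only add it if it doesn't already exist.
if not session.query(Connection).filter(Connection.conn_id == s3_conn.conn_id).first():
    session.add(s3_conn)
    session.commit()
session.close()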
remote_log_conn_id = my_conn_S3
encrypt_s3_logs = False
logging_config_class = log_config.LOGGING_CONFIG
task_log_reader = s3.task
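A quick sanity check that these settings are actually picked up (my own sketch; run it in the same environment and PYTHONPATH the webserver uses):

import logging

from airflow import configuration as conf  # importing airflow should also apply the logging config

print(conf.get('core', 'task_log_reader'))         # expect: s3.task
print(logging.getLogger('airflow.task').handlers)  # expect an S3TaskHandler, not a FileTaskHandler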
And in my log_config.py I have the following setup:
import os

from airflow import configuration as conf

LOG_LEVEL = conf.get('core', 'LOGGING_LEVEL').upper()
LOG_FORMAT = conf.get('core', 'log_format')

BASE_LOG_FOLDER = conf.get('core', 'BASE_LOG_FOLDER')
PROCESSOR_LOG_FOLDER = conf.get('scheduler', 'child_process_log_directory')

FILENAME_TEMPLATE = '{{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log'
PROCESSOR_FILENAME_TEMPLATE = '{{ filename }}.log'

S3_LOG_FOLDER = 's3://data-team-airflow-logs/airflow-master-tester/'

LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'airflow.task': {
            'format': LOG_FORMAT,
        },
        'airflow.processor': {
            'format': LOG_FORMAT,
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'airflow.task',
            'stream': 'ext://sys.stdout'
        },
        'file.processor': {
            'class': 'airflow.utils.log.file_processor_handler.FileProcessorHandler',
            'formatter': 'airflow.processor',
            'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER),
            'filename_template': PROCESSOR_FILENAME_TEMPLATE,
        },
        # When using s3 or gcs, provide a customized LOGGING_CONFIG
        # in airflow_local_settings within your PYTHONPATH, see UPDATING.md
        # for details
        's3.task': {
            'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
            'formatter': 'airflow.task',
            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
            's3_log_folder': S3_LOG_FOLDER,
            'filename_template': FILENAME_TEMPLATE,
        },
    },
    'loggers': {
        '': {
            'handlers': ['console'],
            'level': LOG_LEVEL
        },
        'airflow': {
            'handlers': ['console'],
            'level': LOG_LEVEL,
            'propagate': False,
        },
        'airflow.processor': {
            'handlers': ['file.processor'],
            'level': LOG_LEVEL,
            'propagate': True,
        },
        'airflow.task': {
            'handlers': ['s3.task'],
            'level': LOG_LEVEL,
            'propagate': False,
        },
        'airflow.task_runner': {
            'handlers': ['s3.task'],
            'level': LOG_LEVEL,
            'propagate': True,
        },
    }
}
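One extra check that is easy to run from a Python shell on the webserver box is to confirm the handler's connection can actually reach the log bucket (my own sketch, Airflow 1.9-era S3Hook API, reusing the connection id and bucket from above):

from airflow.hooks.S3_hook import S3Hook

# Same connection the s3.task handler uses.
hook = S3Hook(aws_conn_id='my_conn_S3')
print(hook.check_for_bucket('data-team-airflow-logs'))  # expect: True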
I can see the logs on S3, but when I navigate to the task logs in the UI all I get is:
*** Log file isn't local.
*** Fetching here: http://1eb84d89b723:8793/log/hermes_pull_double_click_click/hermes_pull_double_click_click/2018-02-26T11:22:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='1eb84d89b723', port=8793): Max retries exceeded with url: /log/hermes_pull_double_click_click/hermes_pull_double_click_click/2018-02-26T11:22:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe6940fc048>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I can see in the logs that it's successfully importing log_config.py (I included an __init__.py as well), so I can't see why it's using the FileTaskHandler here instead of the S3 one.
Any help would be great, thanks
In my scenario it wasn't Airflow that was at fault here.
I was able to go to the Gitter channel and talk to the guys there.
After putting print statements into the Python code that was running, I was able to catch an exception on this line of code:
https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/utils/log/s3_task_handler.py#L119
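(For anyone doing the same: the handler wraps its S3 calls in broad try/except blocks that swallow errors, so the "print statements" were just a temporary print-and-reraise inside the relevant except, roughly like the illustration below; this is not the exact code at the linked line.)

# Illustration only: temporary instrumentation so the swallowed exception shows up
# in the webserver output instead of the handler quietly falling back to the
# local FileTaskHandler behaviour.
try:
    return self.hook.get_key(remote_log_location) is not None
except Exception as exc:
    print('S3 log lookup failed: %r' % exc)   # temporary debug output
    raise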
The exception was a maximum recursion depth issue on the SSLContext, which after looking around on the web seemed to be coming from using some combination of gevent with gunicorn.
https://github.com/gevent/gevent/issues/903
I switched the gunicorn worker class back to sync and had to change the AWS ELB listener to TCP, but after that the logs were working fine through the UI.
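For reference, the worker class setting lives under [webserver] in airflow.cfg (I'm assuming here, per the issue linked above, that it had been set to an async class like gevent):
worker_class = sync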
Hope this helps others.