I have configured my Airflow setup to run with systemd according to this. It worked fine for a couple of days, but it has since thrown errors that I can't figure out how to fix. Running sudo systemctl start airflow-webserver.service doesn't really do anything, but running airflow webserver directly works (however, we need systemd for our purposes). To find the error, I ran sudo systemctl status airflow-webserver.service, which gives the following status and error:
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:43,774] {models.py:258} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /home/ec2-user/airflow/dags/statcan_1410009501.py:33: SyntaxWarning: name 'pg_hook' is assigned to before global declaration
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: global pg_hook
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /usr/lib/python2.7/site-packages/airflow/utils/helpers.py:346: DeprecationWarning: Importing 'PythonOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operat...irely in Airflow 2.0.
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: DeprecationWarning)
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /usr/lib/python2.7/site-packages/airflow/utils/helpers.py:346: DeprecationWarning: Importing 'BashOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operator...irely in Airflow 2.0.
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: DeprecationWarning)
Feb 20 18:54:44 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:44,528] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
Feb 20 18:54:45 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:45 +0000] [19733] [INFO] Starting gunicorn 19.9.0
Feb 20 18:54:45 ip-172-31-25-17.ec2.internal airflow[19660]: Error: /run/airflow doesn't exist. Can't create pidfile.
The scheduler seems to be working fine, as verified by running both systemctl status airflow-scheduler.service and journalctl -f.
Here is the setup of the relevant systemd files:
/usr/lib/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=ec2-user
Type=simple
ExecStart=/bin/airflow scheduler
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
/etc/tmpfiles.d/airflow.conf
D /run/airflow 0755 airflow airflow
/etc/sysconfig/airflow
AIRFLOW_CONFIG= $AIRFLOW_HOME/airflow.cfg
AIRFLOW_HOME= /home/ec2-user/airflow
Prior to this error, I moved my Airflow installation from the root directory to my home directory. I'm not sure whether that affected my setup, but I'm mentioning it in case it's relevant.
Can anyone explain the error and how to fix it? I tried to configure systemd as closely as possible to the instructions, but maybe I'm missing something?
Edit 2:
Sorry, I pasted the wrong unit file earlier. This is my actual airflow-webserver.service:
[Unit]
Description=Airflow webserver daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=ec2-user
Type=simple
ExecStart=/bin/airflow webserver --pid /run/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true
[Install]
WantedBy=multi-user.target
You can perform start/stop/restart actions on an Airflow service with monit; the command for each service is given below: run sudo monit <action> scheduler for the Airflow scheduler, and sudo monit <action> webserver for the Airflow webserver.
I encountered this issue too and was able to resolve it by adding runtime-directory parameters under [Service] in the airflow-webserver.service unit file:
[Service]
RuntimeDirectory=airflow
RuntimeDirectoryMode=0775
I was not able to figure out how to get it to work with /etc/tmpfiles.d/airflow.conf alone.
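For context, with those two directives added, the full [Service] section might look like the sketch below (paths and the ec2-user account are copied from the question, so adjust them for your host). systemd creates RuntimeDirectory relative to /run and owns it as the unit's User, which also sidesteps any mismatch with the airflow:airflow ownership in the tmpfiles.d line:

```ini
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=ec2-user
Type=simple
# systemd creates /run/airflow (owned by User) before ExecStart runs
# and removes it when the service stops, so no tmpfiles.d entry is needed.
RuntimeDirectory=airflow
RuntimeDirectoryMode=0775
ExecStart=/bin/airflow webserver --pid /run/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true
```

After editing the unit file, run sudo systemctl daemon-reload before restarting the service so systemd picks up the change.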
The config file /etc/tmpfiles.d/airflow.conf is read by the systemd-tmpfiles-setup service at boot, so a server restart should create the /run/airflow directory. It's not possible to just restart that service, as per https://github.com/systemd/systemd/issues/8684.
As suggested at the above link, after copying airflow.conf to /etc/tmpfiles.d/, just run sudo systemd-tmpfiles --create and /run/airflow should get created.
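Roughly, the recovery sequence without a reboot would be the following (the unit name and config path are taken from the question; these commands must run on the affected host):

```shell
# Re-apply tmpfiles configuration now, instead of waiting for the next boot
sudo systemd-tmpfiles --create /etc/tmpfiles.d/airflow.conf

# Confirm the runtime directory exists with the expected mode and owner
ls -ld /run/airflow

# Then restart the webserver unit, which can now write its pidfile
sudo systemctl restart airflow-webserver.service
```

Passing the config file as an argument limits systemd-tmpfiles to just that file rather than processing every entry under /etc/tmpfiles.d/.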