"Error: /run/airflow doesn't exist. Can't create pidfile." when using systemd for Airflow webserver

I have configured my Airflow setup to run with systemd according to this. It worked fine for a couple of days, but it has since thrown errors that I can't figure out how to fix. Running sudo systemctl start airflow-webserver.service doesn't really do anything, but running airflow webserver works (however, using systemd is needed for our purposes).

To understand the error, I ran sudo systemctl status airflow-webserver.service, which gives the following status and error:

Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:43,774] {models.py:258} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /home/ec2-user/airflow/dags/statcan_1410009501.py:33: SyntaxWarning: name 'pg_hook' is assigned to before global declaration
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: global pg_hook
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /usr/lib/python2.7/site-packages/airflow/utils/helpers.py:346: DeprecationWarning: Importing 'PythonOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operat...irely in Airflow 2.0.
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: DeprecationWarning)
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: /usr/lib/python2.7/site-packages/airflow/utils/helpers.py:346: DeprecationWarning: Importing 'BashOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operator...irely in Airflow 2.0.
Feb 20 18:54:43 ip-172-31-25-17.ec2.internal airflow[19660]: DeprecationWarning)
Feb 20 18:54:44 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:44,528] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
Feb 20 18:54:45 ip-172-31-25-17.ec2.internal airflow[19660]: [2019-02-20 18:54:45 +0000] [19733] [INFO] Starting gunicorn 19.9.0
Feb 20 18:54:45 ip-172-31-25-17.ec2.internal airflow[19660]: Error: /run/airflow doesn't exist. Can't create pidfile.

The scheduler seems to be working fine, as verified by running both systemctl status airflow-scheduler.service and journalctl -f.
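
In case it helps narrow things down, the error points at /run/airflow specifically, so the directory can be checked directly and the unit's journal followed while starting it. A quick sketch (plain shell, nothing specific to my setup):

# Does the runtime directory the error refers to exist at all?
ls -ld /run/airflow

# Start the unit and follow only its own log output
sudo systemctl restart airflow-webserver.service
journalctl -u airflow-webserver.service -f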

Here's the setup of the relevant configuration files:

/usr/lib/systemd/system/airflow-webserver.service

[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service

[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=ec2-user
Type=simple
ExecStart=/bin/airflow scheduler
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target

/etc/tmpfiles.d/airflow.conf

D /run/airflow 0755 airflow airflow

/etc/sysconfig/airflow

AIRFLOW_CONFIG= $AIRFLOW_HOME/airflow.cfg
AIRFLOW_HOME= /home/ec2-user/airflow

Prior to this error, I moved my Airflow installation from root to my home directory. I'm not sure whether that affected my setup, but I'm mentioning it in case it's relevant.

Can anyone explain the error and how to fix it? I tried to configure systemd as closely as possible to the instructions, but maybe I'm missing something.

Edit 2:

Sorry, I pasted the wrong unit file above. This is my actual airflow-webserver.service:

[Unit]
Description=Airflow webserver daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service

[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=ec2-user
Type=simple
ExecStart=/bin/airflow webserver --pid /run/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target
asked Feb 20 '19 by Czarina Catambing


2 Answers

I encountered this issue too and was able to resolve it by providing runtime directory parameters under [Service] in the airflow-webserver.service unit file:

[Service]
RuntimeDirectory=airflow
RuntimeDirectoryMode=0775

I was not able to figure out how to get it to work with /etc/tmpfiles.d/airflow.conf alone.
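
In case it's useful, a rough sketch of applying the change (standard systemd commands, assuming the unit file lives at /usr/lib/systemd/system/airflow-webserver.service as in the question):

sudo systemctl daemon-reload                      # pick up the edited unit file
sudo systemctl restart airflow-webserver.service  # systemd now creates /run/airflow for the service
ls -ld /run/airflow                               # should exist while the service is running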

answered Sep 18 '22 by dstandish

The config file /etc/tmpfiles.d/airflow.conf is read by the systemd-tmpfiles-setup service at boot, so a server restart should create the /run/airflow directory. You can't simply restart that service to apply the change, as per https://github.com/systemd/systemd/issues/8684.

As suggested at the above link, after copying airflow.conf to /etc/tmpfiles.d/, just run sudo systemd-tmpfiles --create and /run/airflow should get created.
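
Putting that together, a minimal sketch (assuming an airflow user and group exist, since the tmpfiles entry in the question assigns them as owners):

sudo cp airflow.conf /etc/tmpfiles.d/airflow.conf
sudo systemd-tmpfiles --create    # apply tmpfiles.d entries now, without rebooting
ls -ld /run/airflow               # should be owned by airflow:airflow
sudo systemctl restart airflow-webserver.service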

answered Sep 22 '22 by rubpa