Restarting the airflow scheduler

I'm trying to get Airflow working to better orchestrate an ETL process. When I make changes to a DAG in my dags folder, I often have to restart the scheduler with

airflow scheduler

before the changes are visible in the UI. I would like to run the scheduler as a daemon process with

airflow scheduler -D

but when I try to do so, I get a message saying

[2018-10-17 14:13:54,769] {jobs.py:580} ERROR - 
Cannot use more than 1 thread when using sqlite. Setting max_threads to 1

I think this error pops up because the scheduler is already running as a daemon. However, when I try to find out where the scheduler is being run with

lsof -i

I don't get any results.

Question: Why am I not able to restart the scheduler with airflow scheduler -D? Why does the scheduler restart with airflow webserver? How do I successfully kill the process that is preventing me from running airflow scheduler -D?

Mr. President asked Oct 17 '18 12:10



2 Answers

Run ps aux | grep airflow and check whether any airflow webserver or airflow scheduler processes are running. If they are, kill them and rerun the scheduler with airflow scheduler -D.
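
The check above could be sketched as a small shell filter. This is only an illustration: filter_airflow_pids is a hypothetical helper name, and it assumes the processes show "airflow scheduler" or "airflow webserver" in their command line, as in the commands from the question.

```shell
# Hypothetical helper: read `ps` output (PID followed by the command line)
# on stdin and print the PID of every line that mentions
# "airflow scheduler" or "airflow webserver", skipping grep itself.
filter_airflow_pids() {
    awk '/airflow (scheduler|webserver)/ && !/grep/ { print $1 }'
}
```

Feeding it with ps -e -o pid= -o args= | filter_airflow_pids lists the candidate PIDs, which can then be inspected and passed to kill one by one.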

kaxil answered Oct 14 '22 04:10


You need to clear out the airflow-scheduler.pid file in $AIRFLOW_HOME. The stale pid file left behind by the daemon will prevent you from starting another scheduler process.
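
A cautious way to do this is to delete the pid file only when the process it names is really gone. The sketch below assumes the scheduler was started by the current user (otherwise kill -0 can fail even though the process is alive); remove_stale_pid_file is a hypothetical helper name, not part of Airflow.

```shell
# Hypothetical helper: remove a pid file only if the process it points to
# is no longer running. Assumes the process, if alive, belongs to the
# current user, so that `kill -0` is allowed to probe it.
remove_stale_pid_file() {
    pid_file="$1"
    if [ ! -f "$pid_file" ]; then
        echo "no pid file at $pid_file"
        return 0
    fi
    pid="$(cat "$pid_file")"
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; leaving $pid_file in place"
        return 1
    fi
    rm -f "$pid_file"
    echo "removed stale pid file $pid_file"
}
```

It could be called as remove_stale_pid_file "${AIRFLOW_HOME:-$HOME/airflow}/airflow-scheduler.pid" before running airflow scheduler -D again; the ~/airflow fallback is Airflow's default home directory and may differ on your installation.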

Magic Draggn answered Oct 14 '22 03:10