I have a super simple test DAG that looks like this:
from datetime import datetime

from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator

dag = DAG(
    dag_id='scheduler_test_dag',
    start_date=datetime(2017, 9, 9, 4, 0, 0, 0),  # EC2 time; equal to 11pm Mexico time
    max_active_runs=1,
    schedule_interval='@once'  # externally triggered
)

def ticker_function():
    # Append the current timestamp to a file each time the task runs.
    with open('/tmp/ticker', 'a') as outfile:
        outfile.write('{}\n'.format(datetime.now()))

time_ticker = PythonOperator(
    task_id='time_ticker',
    python_callable=ticker_function,
    dag=dag
)
Since upgrading to apache-airflow v1.9, this DAG hangs and won't run. Digging into the scheduler logs, I found this error trace:
[2018-02-12 17:03:06,259] {jobs.py:1754} INFO - DAG(s) dict_keys(['scheduler_test_dag']) retrieved from /home/ubuntu/airflow/dags/scheduler_test_dag.py
[2018-02-12 17:03:06,315] {jobs.py:1386} INFO - Processing scheduler_test_dag
[2018-02-12 17:03:06,320] {jobs.py:379} ERROR - Got an exception! Propagating...
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 371, in helper
    pickle_dags)
  File "/usr/local/lib/python3.5/dist-packages/airflow/utils/db.py", line 50, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 1792, in process_file
    self._process_dags(dagbag, dags, ti_keys_to_schedule)
  File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 1388, in _process_dags
    dag_run = self.create_dag_run(dag)
  File "/usr/local/lib/python3.5/dist-packages/airflow/utils/db.py", line 50, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 807, in create_dag_run
    if next_start <= now:
TypeError: unorderable types: NoneType() <= datetime.datetime()
Where is this error coming from? The only thing I can think of is that the behavior of schedule_interval='@once' has changed, which is the one thing this DAG has in common with the one other broken DAG on my server since the v1.9 upgrade. Otherwise it's the most basic DAG imaginable; there doesn't seem to be anything that could go wrong. Previously I was using the basic pip install before switching to the apache-airflow package.
I took a screenshot of the Web UI: everything seems to be working fine except the top and bottom DAGs, which have schedule_interval set to @once and are indefinitely hung.
Any thoughts?
An Airflow DAG with a start_date, possibly an end_date, and a schedule_interval defines a series of intervals, which the scheduler turns into individual DAG runs and executes.
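As a rough illustration (plain datetime arithmetic, not the Airflow API; a daily interval is used as a stand-in for this question's @once), here is how a schedule_interval carves the time after start_date into intervals, each of which becomes one DAG run:

from datetime import datetime, timedelta

# Illustrative values: a daily interval starting from the question's start_date.
start_date = datetime(2017, 9, 9)
schedule_interval = timedelta(days=1)

# Each DAG run covers one interval; its execution_date is the start of
# the interval, and the scheduler triggers it once the interval has ended.
for i in range(3):
    execution_date = start_date + i * schedule_interval
    triggered_at = execution_date + schedule_interval
    print('run {}: execution_date={}, triggered after {}'.format(
        i, execution_date, triggered_at))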
The Airflow scheduler is designed to run as a persistent service in an Airflow production environment. To kick it off, all you need to do is execute the airflow scheduler command. It uses the configuration specified in airflow.cfg.
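For example, assuming the airflow executable is on your PATH and AIRFLOW_HOME points at the directory holding airflow.cfg:

airflow scheduler    # runs in the foreground; add -D to daemonize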
You can start/stop/restart the Airflow services with monit; the command for each service is given below: run sudo monit <action> scheduler for the Airflow scheduler, and sudo monit <action> webserver for the Airflow webserver.
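For instance, assuming monit has been set up with services named scheduler and webserver (these names come from your monit configuration, not from Airflow itself):

sudo monit restart scheduler
sudo monit restart webserver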
CLI check for the scheduler: at startup the scheduler creates a BaseJob record with information about the host and a timestamp (heartbeat), and then updates it regularly. You can use this to check whether the scheduler is working correctly, via the airflow jobs check command. On failure, the command will exit with a non-zero error code.
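A sketch of that check, assuming an Airflow release that ships the command (it exists in 2.x, not in the 1.9 release this question is about):

airflow jobs check --job-type SchedulerJob    # exits non-zero if no healthy heartbeat is found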
Have you defined catchup as False in your airflow.cfg (catchup_by_default = False)? That non-catchup code path in jobs.py is what raises this error for @once DAGs, and it is fixed in master. Enable catchup for this DAG and it should start working.
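If that diagnosis is right, a minimal sketch of the workaround, reusing the DAG from the question, is to set catchup explicitly:

from datetime import datetime
from airflow.models import DAG

dag = DAG(
    dag_id='scheduler_test_dag',
    start_date=datetime(2017, 9, 9, 4, 0, 0, 0),
    max_active_runs=1,
    schedule_interval='@once',
    catchup=True,  # bypass the non-catchup branch in jobs.py that raises the TypeError
)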