
Airflow sending SIGTERMs to tasks randomly

I was running into an issue with Airflow 1.10.1. Some of the tasks in my DAGs are getting SIGTERM from helpers.py. From what I understand, this is meant to shut down the workers and terminate all child processes, but I see it in only around 2-3 tasks out of a 10-task DAG, and the tasks that receive the signal change when the DAG is run again. Are there specific criteria for sending these SIGTERM signals? Logs for a task which received SIGTERM:

[2019-12-10 11:13:44,530] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 [2019-12-10 11:13:44,520] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=3600
[2019-12-10 11:13:45,489] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 [2019-12-10 11:13:45,488] {__init__.py:51} INFO - Using executor CeleryExecutor
[2019-12-10 11:13:45,934] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 [2019-12-10 11:13:45,933] {models.py:271} INFO - Filling up the DagBag from /home/centos/airflow/dags/61b6c300e82643b0f294df6f.py
[2019-12-10 11:13:46,580] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 Connected to MongoDB...
[2019-12-10 11:13:47,510] {bash_operator.py:74} INFO - Tmp dir root location:
/tmp
[2019-12-10 11:13:47,510] {bash_operator.py:87} INFO - Temporary script location: /tmp/airflowtmpal71kawr/BS_PMU2rjty_k9l
[2019-12-10 11:13:47,511] {bash_operator.py:97} INFO - Running command:
[2019-12-10 11:13:47,542] {bash_operator.py:106} INFO - Output:
[2019-12-10 11:13:47,542] {bash_operator.py:114} INFO - Command exited with return code 0
[2019-12-10 11:13:57,559] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 2019-12-10 11:13:57,556 - root - INFO - Putting xcom with return value:
[2019-12-10 11:13:57,631] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 2019-12-10 11:13:57,625 - root - INFO - WorkflowID: 61b6c300e82643b0f294df6f, RunID: 456c5bfb16556a3adc3b251a, TaskID: BS_PMU2
[2019-12-10 11:13:57,652] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 2019-12-10 11:13:57,643 - root - ERROR - Invalid key/value. Will skip setting xcom.
[2019-12-10 11:13:57,652] {base_task_runner.py:101} INFO - Job 25404: Subtask BS_PMU2 2019-12-10 11:13:57,644 - root - INFO - Done Execute
[2019-12-10 11:13:58,663] {helpers.py:240} INFO - Sending Signals.SIGTERM to GPID 9696
[2019-12-10 11:13:58,674] {helpers.py:230} INFO - Process psutil.Process(pid=9696 (terminated)) (9696) terminated with exit code 15
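
For anyone decoding the last two log lines: signal 15 is SIGTERM, and "GPID" indicates it was sent to a whole process group rather than a single PID. The sketch below is not Airflow's code. It is only a small POSIX-only illustration, using the standard library, of what terminating a process group looks like and how the signal number shows up in the exit status. The sleeping child is a hypothetical stand-in for a task process.

```python
import os
import signal
import subprocess
import sys

# Start a child in its own session, so it also leads its own process group.
# The sleeping interpreter is a stand-in for a long-running task process.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    start_new_session=True,
)

# Send SIGTERM to the child's whole process group, not just to one PID.
os.killpg(os.getpgid(child.pid), signal.SIGTERM)

# subprocess reports death-by-signal as the negative signal number.
returncode = child.wait(timeout=10)
print(f"signal number: {int(signal.SIGTERM)}, returncode: {returncode}")
```

Python's subprocess module reports the signal as a negative return code, while the log above prints it as a positive 15; both refer to the same signal.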
Saurabh Sinha asked Dec 12 '19 06:12


1 Answer

If you want to stay on the same version of Airflow, you can try increasing killed_task_cleanup_time in the [core] section of your Airflow configuration, or set the equivalent environment variable AIRFLOW__CORE__KILLED_TASK_CLEANUP_TIME.
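
A minimal sketch of the naming convention behind that variable, assuming Airflow's documented AIRFLOW__{SECTION}__{KEY} pattern for environment overrides. The helper name airflow_env_name and the value of 300 seconds are hypothetical, chosen only for illustration and not as a recommended setting.

```python
import os


def airflow_env_name(section: str, key: str) -> str:
    """Build the environment-variable name for an Airflow config option.

    Hypothetical helper: it only mirrors the AIRFLOW__{SECTION}__{KEY}
    naming pattern and does not import or call Airflow itself.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Illustrative value in seconds; tune it for your own workload.
CLEANUP_SECONDS = 300

name = airflow_env_name("core", "killed_task_cleanup_time")
os.environ[name] = str(CLEANUP_SECONDS)
print(f"{name}={os.environ[name]}")
```

Setting the variable from Python only affects that process and its children. In practice it would need to be present in the environment of the Airflow worker and scheduler processes before they start, or be written to airflow.cfg as killed_task_cleanup_time under [core].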

Upgrading to Airflow 2.x or later may also help.

See the documentation for details: https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#killed-task-cleanup-time

Akhil Ghatiki answered Oct 07 '22 11:10