I have a fresh installation of apache-airflow 1.8.2. I started its webserver, and its gunicorn workers exit on every webpage request, leaving the request hanging for around 30s while waiting for a new worker to spawn. I need help fixing this.
I've installed apache-airflow 1.8.2 and followed this guide. I started the webserver at port 8081.
Now when I visit the server in my browser, the response is very slow. Looking at the console output, I noticed that every time I load a webpage, it logs "Worker exiting", then pauses for a long time before logging "Booting worker".
After digging into the source code I found out that these are gunicorn workers. I have no experience with gunicorn, airflow, or Flask, so I don't know if this is the expected behavior, but I feel it shouldn't be: at the very least, a webserver should not hang for half a minute on every page request.
Console output:
---> Browser request
[2017-11-01 19:08:07 -0700] [14549] [INFO] Worker exiting (pid: 14549)
---> Hangs for 30s
[2017-11-01 19:08:37 -0700] [13316] [INFO] Handling signal: ttin
[2017-11-01 19:08:37 -0700] [14698] [INFO] Booting worker with pid: 14698
/Users/michael/Programs/clones/airflow/airflow/www/app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been renamed to "CSRFProtect" and will be removed in 1.0.
csrf = CsrfProtect()
/Users/michael/Programs/miaozhen/tests/airflow-test/lib/python3.6/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
127.0.0.1 - - [01/Nov/2017:19:08:37 -0700] "GET /admin/ HTTP/1.1" 200 95063 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
[2017-11-01 19:08:38,096] [14698] {models.py:168} INFO - Filling up the DagBag from /Users/michael/airflow/dags
---> other GET requests on the same webpage, skipped here for simplicity
[2017-11-01 19:08:39 -0700] [13316] [INFO] Handling signal: ttou
Now I'm running a source version of apache-airflow 1.8.2 (i.e. I cloned the source, checked out the tag, and installed with pip install -e .) in a virtualenv. However, I've also tried running the PyPI version (pip install apache-airflow) without a virtualenv, and running the source version without a virtualenv. The same problem exists for all installations, so these differences seem irrelevant.
My Python installation is:
$ python -VV
Python 3.6.3 (default, Oct 4 2017, 06:09:38)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]
EDIT:
I tried installing & running apache-airflow on another machine (Ubuntu Linux 16.04 + Python 3.5), and there is no problem. I also asked another person on a Mac with Python 3.6, and they have no problem either. I guess there's something weird with my machine... Any suggestions for how I can debug this?
Gunicorn is based on the pre-fork worker model. This means that there is a central master process that manages a set of worker processes. The master never knows anything about individual clients. All requests and responses are handled completely by worker processes.
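The pre-fork model described above can be sketched in a few lines of Python. This is a minimal illustration using os.fork on POSIX, not gunicorn's actual code; the names prefork and handle are made up for the example:

```python
import os

def prefork(num_workers, handle):
    """Minimal pre-fork sketch: the master forks workers up front,
    workers do all request handling, the master only manages pids."""
    pids = []
    for i in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Worker process: handles requests; the master never sees clients.
            handle(i)
            os._exit(0)
        # Master process: records the worker's pid and keeps forking.
        pids.append(pid)
    # Master waits for workers to exit (a real master would respawn them).
    for pid in pids:
        os.waitpid(pid, 0)
    return pids

if __name__ == "__main__":
    worker_pids = prefork(3, lambda i: None)
    print(len(worker_pids))
```

In gunicorn's real master loop the workers serve HTTP from a shared listening socket, and the master respawns any worker that dies; the sketch only shows the fork/wait skeleton.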
If you are running behind Nginx, increase its proxy timeout (e.g. proxy_read_timeout 300s;) and restart the Nginx server. If that doesn't fix it, increase the timeout flag in your Gunicorn configuration; the default Gunicorn timeout is 30 seconds (e.g. --timeout 90). See the Gunicorn documentation about timeout.
After 30 seconds of request processing (configurable with timeout), the gunicorn master process sends SIGTERM to the worker process to initiate a graceful restart. If the worker does not shut down within another 30 seconds (configurable with graceful_timeout), the master process sends SIGKILL.
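That TERM-then-KILL sequence can be sketched as follows. This is a simplified illustration of the idea, not gunicorn's real implementation; stop_worker is a hypothetical helper name:

```python
import os
import signal
import time

def stop_worker(pid, graceful_timeout=30.0, poll=0.05):
    """Sketch of a graceful-stop sequence: SIGTERM first, then
    SIGKILL if the worker outlives graceful_timeout seconds."""
    os.kill(pid, signal.SIGTERM)          # ask the worker to shut down
    deadline = time.monotonic() + graceful_timeout
    while time.monotonic() < deadline:
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done == pid:                   # worker exited in time
            return "graceful"
        time.sleep(poll)
    os.kill(pid, signal.SIGKILL)          # worker ignored us: force it
    os.waitpid(pid, 0)
    return "killed"
```

A worker that honors SIGTERM exits on the "graceful" path; one that ignores it (e.g. stuck in a long request) is reaped via SIGKILL after the grace period.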
Workers regularly exiting on the ttou signal (which means "decrement the number of processes by one") is intentional: it is Airflow periodically "refreshing" its workers. Based on what I read in AIRFLOW-276, which added this feature, refreshing workers ensures they pick up new or updated DAGs. This behavior can be tuned in your airflow config via worker_refresh_interval and worker_refresh_batch_size.
From looking at the source, Airflow spins up new workers before spinning down old ones, so I don't think the refresh itself should delay your requests. However, you can try disabling it by setting worker_refresh_batch_size = 0.
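For reference, both settings live in the [webserver] section of airflow.cfg. A sketch of the relevant fragment (the values shown are illustrative):

```ini
[webserver]
# Seconds between worker refresh batches
worker_refresh_interval = 30
# Number of workers to refresh at a time; 0 disables the periodic refresh
worker_refresh_batch_size = 0
```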