Long running script from flask endpoint

I've been pulling my hair out trying to figure this one out, hoping someone else has already encountered this and knows how to solve it :)

I'm trying to build a very simple Flask endpoint that just needs to call a long running, blocking php script (think while true {...}). I've tried a few different methods to async launch the script, but the problem is my browser never actually receives the response back, even though the code for generating the response after running the script is executed.

I've tried using both multiprocessing and threading; neither seems to work:

# multiprocessing attempt
import json
import multiprocessing
import os
import subprocess

from flask import Flask

app = Flask(__name__)

@app.route('/endpoint')
def endpoint():
  def worker():
    # launch the long-running script in its own process group
    subprocess.Popen('nohup php script.php &', shell=True, preexec_fn=os.setpgrp)

  p = multiprocessing.Process(target=worker)
  print('111111')
  p.start()
  print('222222')
  return json.dumps({
    'success': True
  })

# threading attempt
import json
import os
import subprocess
import threading

@app.route('/endpoint')
def endpoint():
  def thread_func():
    # same detached launch, but from a thread instead of a process
    subprocess.Popen('nohup php script.php &', shell=True, preexec_fn=os.setpgrp)

  t = threading.Thread(target=thread_func)
  print('111111')
  t.start()
  print('222222')
  return json.dumps({
    'success': True
  })

In both scenarios I see the 111111 and 222222, yet my browser still hangs waiting for the response from the endpoint. I've tried p.daemon = True for the process, as well as p.terminate(), but no luck. I had hoped that launching the script with nohup in a different shell and a separate process/thread would just work, but somehow Flask or uWSGI is affected by it.
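One hypothesis I haven't been able to verify: with shell=True the child inherits every open file descriptor, possibly including the socket uWSGI is serving the request on, so the connection could be held open until the child exits. A sketch that detaches more aggressively (launch_detached is a made-up helper name):

```python
import os
import subprocess

def launch_detached(cmd):
  # Run cmd in its own process group, with stdio redirected to /dev/null
  # and all inherited descriptors closed, so the child cannot keep the
  # uWSGI/Nginx socket (or any other fd) alive after the response is sent.
  with open(os.devnull, 'wb') as devnull:
    return subprocess.Popen(cmd,
                            stdin=devnull, stdout=devnull, stderr=devnull,
                            close_fds=True,
                            preexec_fn=os.setpgrp)

# e.g. launch_detached(['php', 'script.php'])
```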

Update

Since this does work locally on my Mac when I start my Flask app directly with python app.py and hit it without going through my Nginx proxy and uWSGI, I'm starting to believe the code itself isn't the problem. And because Nginx just forwards the request to uWSGI, I suspect something there is causing it.

Here is my uWSGI ini configuration for the domain, running in emperor mode:

[uwsgi]
protocol = uwsgi
max-requests = 5000
chmod-socket = 660
master = True
vacuum = True
enable-threads = True
auto-procname = True
procname-prefix = michael-
chdir = /srv/www/mysite.com
module = app
callable = app
socket = /tmp/mysite.com.sock
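For completeness, if it turns out the spawned script is inheriting the uWSGI socket (just a guess at this point), uWSGI has a close-on-exec option that marks its sockets so exec'd children do not inherit them:

```ini
[uwsgi]
; speculative addition: set close-on-exec on sockets so processes
; spawned from request handlers do not inherit them
close-on-exec = true
```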
asked Sep 19 '18 by smaili

1 Answer

This kind of thing is the actual, and probably main, use case for Python Celery (https://docs.celeryproject.org/). As a general rule, do not run long-running, CPU-bound jobs in the WSGI process: it's tricky, it's inefficient, and, most importantly, it's more complicated than setting up an async task in a Celery worker. If you just want to prototype, you can set the broker to memory to avoid an external server, or run a single-threaded Redis instance on the same machine.

This way you can launch the task and call task.result(), which blocks, but blocks in an IO-bound fashion; or, better, return immediately with the task_id and build a second endpoint /result?task_id=<task_id> that checks whether the result is available:

from celery.result import AsyncResult

result = AsyncResult(task_id, app=app)  # app here is the Celery app instance
if result.state == "SUCCESS":
    return result.get()
else:
    return result.state  # or do something else depending on the state

This way you have a non-blocking WSGI app that does what it is best suited for: short, CPU-light calls that at most make IO calls, with OS-level scheduling. You can then rely on the WSGI server's workers, processes, or threads (in uWSGI, gunicorn, etc.) to scale the API for 99% of workloads, while Celery scales horizontally by increasing the number of worker processes.
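If you want to prototype the same launch-then-poll shape without standing up Celery at all, a stdlib-only sketch looks like this (submit and check are made-up names for illustration):

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)
_tasks = {}  # task_id -> Future

def submit(fn, *args):
  # start the job in the background and hand back an id immediately,
  # analogous to task.delay() in Celery
  task_id = str(uuid.uuid4())
  _tasks[task_id] = _executor.submit(fn, *args)
  return task_id

def check(task_id):
  # poll without blocking, analogous to inspecting AsyncResult.state
  future = _tasks[task_id]
  if future.done():
    return {'state': 'SUCCESS', 'result': future.result()}
  return {'state': 'PENDING'}
```

The /endpoint handler would return the id from submit() instead of blocking, and the /result endpoint would call check(). Celery replaces this in-process registry with a real broker and result backend, which survives restarts and scales across machines.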

answered Oct 05 '22 by danius