Loading an eventstream through Gunicorn + Flask

I'm trying to generate a large PDF using a Flask application. Generating it involves building ten long PDFs and then merging them together. The application runs under Gunicorn with the flags: --worker-class gevent --workers 2.
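For reference, the same two flags can also be written as a Gunicorn config file. This is only a sketch: the filename gunicorn.conf.py and the timeout line are assumptions and do not come from the question. The value 30 is Gunicorn's documented default timeout, and the question does not say whether it was changed.

```python
# gunicorn.conf.py (assumed filename; Gunicorn config files are plain Python)

# Same as --worker-class gevent
worker_class = "gevent"

# Same as --workers 2
workers = 2

# Seconds a worker may stay silent before the master kills it.
# 30 is Gunicorn's default; the question does not state the value in use.
timeout = 30
```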

Here's what my server-side code looks like:

from flask import Flask, Response, stream_with_context

app = Flask(__name__)

# `pdfs` (the list of section names) is defined elsewhere in the app.
@app.route('/pdf/create', methods=['POST', 'GET'])
def create_pdf():
    def generate():
        for section in pdfs:
            yield "data: Generating %s pdf\n\n" % section
            # Generate pdf with pisa (takes up to 2 minutes)

        yield "data: Merging PDFs\n\n"
        # Merge pdfs (takes up to 2 minutes)
        yield "data: /user/pdf_filename.pdf\n\n"

    return Response(stream_with_context(generate()), mimetype='text/event-stream')

The client side code looks like:

var source = new EventSource(create_pdf_url);
source.onopen = function (event) {
    console.log("Creating PDF");
};
source.onmessage = function (event) {
    console.log(event.data);
};
source.onerror = function (event) {
    console.log("ERROR");
};

When I run the app without Gunicorn, I get steady, real-time updates in the console log. They look like:

Creating PDF
Generating section one
Generating section two
Generating section three
...
Generating section ten
Merging PDFS
/user/pdf_filename.pdf

When I run this code with Gunicorn, I don't get regular updates. The worker runs until Gunicorn's timeout kills it, and then I get a dump of all the messages that should have arrived along the way, followed by a final error:

Creating PDF
Generating section one
Generating section two
ERROR

The Gunicorn log looks like:

[2015-03-19 21:57:27 +0000] [3163] [CRITICAL] WORKER TIMEOUT (pid:3174)

How can I keep Gunicorn from killing the worker? I don't think setting a super-large timeout is a good idea. Perhaps there's something in Gunicorn's worker classes that I can use to make sure the request is handled correctly?

Adam Steele asked Mar 20 '15 16:03


1 Answer

I ended up solving the problem using Celery.

I used this example to guide me in setting up Celery.

Then I used Grinberg's Celery tutorial to stream real-time updates to the user's browser.
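As a rough illustration of the idea behind this answer (not the code actually used), the slow work can run outside the request handler while the handler only relays progress messages as Server-Sent Events. The sketch below uses a standard-library thread and queue as a stand-in for a Celery task and its progress state, so it runs without a broker. The names build_pdf, sse_events and start_job are hypothetical, and the PDF generation itself is left as comments.

```python
import queue
import threading


def build_pdf(sections, progress):
    """Slow job that runs outside the request handler.

    In the answer this role is played by a Celery task; a plain thread
    is used here only as a stand-in.
    """
    for section in sections:
        progress.put("Generating %s pdf" % section)
        # ... generate the section PDF here (slow) ...
    progress.put("Merging PDFs")
    # ... merge the section PDFs here (slow) ...
    progress.put("/user/pdf_filename.pdf")
    progress.put(None)  # sentinel: no more updates


def sse_events(progress):
    """Turn queued progress messages into Server-Sent Events frames."""
    while True:
        message = progress.get()
        if message is None:
            break
        yield "data: %s\n\n" % message


def start_job(sections):
    """Start the background job and return a generator of SSE frames."""
    progress = queue.Queue()
    worker = threading.Thread(
        target=build_pdf, args=(sections, progress), daemon=True
    )
    worker.start()
    return sse_events(progress)
```

In a Flask view, the generator returned by start_job could be passed to Response(..., mimetype='text/event-stream') in the same way as generate() in the question. With Celery, the queue would be replaced by the task's reported state, which also survives across worker processes.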

Adam Steele answered Sep 27 '22 18:09