I am using Gunicorn to serve my Flask web app. The app sends requests to download huge files, some more than 10 GB, which takes a while to complete. I stream the download progress back to the webpage using a generator, so the connection stays open until the download is done. My problem is that Gunicorn times out after a certain number of seconds.
I configured the timeout to be longer like this:
/usr/bin/gunicorn -c /my/dir/to/app/gunicorn.conf -b 0.0.0.0:5000 wsgi --timeout 90
but I don't know how long a download will take, so I have to keep raising this timeout as the files get larger and larger.
I was wondering if there is a way to disable the timeout altogether, or if there is another option to remedy long download times.
Worker timeouts: by default, Gunicorn gracefully restarts a worker if it hasn't completed any work within the last 30 seconds.
After 30 seconds (configurable with timeout) of request processing, the Gunicorn master process sends SIGTERM to the worker process to initiate a graceful restart. If the worker does not shut down within another 30 seconds (configurable with graceful_timeout), the master process sends SIGKILL.
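If your goal is to disable the timeout altogether, Gunicorn accepts 0 as a value for it, which disables worker timeouts entirely. A minimal sketch of a gunicorn.conf.py (the graceful_timeout shown is just the default, kept for illustration):

    # gunicorn.conf.py -- sketch; timeout = 0 means workers are never
    # killed for being silent, so a long streaming download can finish.
    timeout = 0
    graceful_timeout = 30  # seconds a worker gets to finish on restart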
The timeout setting you specify with Gunicorn is basically there to release a connection and restart a worker; Gunicorn kills workers it considers idle and restarts them. [1]
If you are streaming back the response, then IMO your worker shouldn't get knocked out and killed by the parent process. Note that a connection is idle when no data is sent or received by a host.
So here is what you might want to try; these are my personal suggestions.
Use the --threads setting with a value greater than 1; this way, your worker may not be sitting idle and could be serving other requests. [2]
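For instance, a hedged sketch of a gunicorn.conf.py using the gthread worker (the worker and thread counts are illustrative, not tuned values):

    # gunicorn.conf.py -- each worker process serves `threads` requests
    # concurrently, so one long-running download does not block the rest.
    workers = 2
    worker_class = "gthread"
    threads = 4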
Instead of specifying a timeout on the server, you could try providing a timeout in the request's headers. For this, you need to understand the Keep-Alive header, which has a timeout parameter. [3] and [4]
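As an illustration, here is a sketch of advertising a Keep-Alive timeout on a streamed Flask response; the header values are assumptions, and clients or proxies are free to ignore them:

    from flask import Flask, Response

    app = Flask(__name__)

    def generate():
        # Placeholder generator standing in for the real progress stream.
        for pct in range(0, 101, 10):
            yield f"progress: {pct}%\n"

    @app.route("/download")
    def download():
        resp = Response(generate(), mimetype="text/plain")
        # Suggest keeping the connection open for up to 600 seconds of
        # inactivity (illustrative values).
        resp.headers["Connection"] = "Keep-Alive"
        resp.headers["Keep-Alive"] = "timeout=600, max=1000"
        return resp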
Use multi-part download to speed up the download of the large file. For this, you need to break the download into chunks and then issue parallel requests to download that large file. [5]
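A hedged sketch of the idea with the requests library; the URL is hypothetical, and the remote server must support the Range header for this to work:

    import concurrent.futures
    import requests

    def download_part(url, start, end):
        # Fetch one byte range of the file.
        headers = {"Range": f"bytes={start}-{end}"}
        return requests.get(url, headers=headers, timeout=60).content

    def multipart_download(url, path, parts=4):
        size = int(requests.head(url, allow_redirects=True,
                                 timeout=60).headers["Content-Length"])
        step = size // parts
        ranges = [(i * step,
                   size - 1 if i == parts - 1 else (i + 1) * step - 1)
                  for i in range(parts)]
        with concurrent.futures.ThreadPoolExecutor(max_workers=parts) as pool:
            blocks = pool.map(lambda r: download_part(url, *r), ranges)
            with open(path, "wb") as f:
                for block in blocks:
                    f.write(block)

For 10 GB files you would stream each part to disk as it arrives rather than buffering whole ranges in memory as this sketch does.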
Since your objective seems to be streaming the progress of the download back to a webpage, instead of keeping the connection alive and open, use a polling technique to fetch the progress: poll every, say, 250-400 ms for an update. This way your system is more robust on slow network connections and scales to arbitrarily large files. The caveat is that you need to maintain the information of how much of the file has been downloaded somewhere the progress handler can read it. I personally built a multi-part download manager in Scala using the Actor framework.
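A minimal sketch of such a polling endpoint (the route and the in-memory progress store are assumptions; with several Gunicorn workers the state would have to live somewhere shared, such as Redis or a database):

    import threading
    from flask import Flask, jsonify

    app = Flask(__name__)
    progress = {}  # download_id -> bytes downloaded so far
    progress_lock = threading.Lock()

    @app.route("/progress/<download_id>")
    def get_progress(download_id):
        # The page polls this every ~250-400 ms instead of holding the
        # streaming connection open for the whole download.
        with progress_lock:
            done = progress.get(download_id, 0)
        return jsonify({"id": download_id, "bytes_downloaded": done})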
One more suggestion: you might also want to try a library like Flask-SocketIO. Although this use case is not really bidirectional communication, the point is that the socket remains open so you can push back progress updates. [6]
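A sketch of what that could look like; the "progress" event name and the reporting helper are assumptions, not part of the Flask-SocketIO API:

    from flask import Flask
    from flask_socketio import SocketIO

    app = Flask(__name__)
    socketio = SocketIO(app)

    def report_progress(bytes_done, total_bytes):
        # Push the current state over the open socket; the browser listens
        # for the "progress" event and updates the page.
        socketio.emit("progress", {"done": bytes_done, "total": total_bytes})

    if __name__ == "__main__":
        socketio.run(app, host="0.0.0.0", port=5000)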