I am building REST API with Flask-restplus. One of my endpoints takes a file uploaded from client and run some analysis. The job uses up to 30 seconds. I don't want the job to block the main process. So the endpoint will return a response with 200 or 201 right away, the job can still be running. Results will be saved to database which will be retrieved later.
It seems I have two options for long-running jobs.
Threading is relatively simpler. But problem is, there is a limit of thread numbers for Flask app. In a standalone Python app, I could use a queue for the threads. But this is REST api, each request call is independent. I don't know if there is a way to maintain a global queue for that. So if the requests exceed the thread limit, it won't be able to take more requests.
Task-queue with Celery and Redis is probably better option. But this is just a proof of concept thing, and time line is kind of tight. Setting up Celery, Redis with Flask is not easy, I am having lots of trouble on my dev machine which is a Windows. It will be deployed on AWS which is kind of complex.
I wonder if there is a third option for this case?
Make sure, you close the terminal and not press Ctrl + C. This will allow it to run in background even when you log out. To stop it from running , ssh in to the pi again and run ps -ef |grep nohup and kill -9 XXXXX where XXXX is the pid you will get ps command.
To keep running the application (so that you may close your laptop and have some fun, while the application keeps running), we will use the powerful linux command: nohup (no hangup). We can see after the nohup command the 'process id' : 21108 of the running process.
Typically, these jobs are created via message queues and performed by worker nodes. Upon detecting an event, a message will be pushed to the queues while the worker nodes keep listening to and process one by one. Some examples of these jobs include: Sending an email to activate new users.
As of Flask 1.0, flask server is multi-threaded by default. Each new request is handled in a new thread. This is a simple Flask application using default settings.
Celery is a fantastic solution to this problem I have used quite successfully in the past to manage millions of jobs per day.
The only real downside is the initial learning curve and complexity of debugging when things go sour (it can happen, especially with millions of jobs).
I would HIGHLY recommend using Celery as you have already mentioned in your post. It is built exactly for this use case. Their docs are really informative and there are no shortage of examples online that can get you up and running quickly.
Additionally, I would say THIS would be an excellent first resource for you to start with.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With