I have a desktop app that I'm in the process of porting to a Django webapp. The app has some quite computationally intensive parts (using numpy, scipy and pandas, among other libraries). Obviously importing the computationally intensive code into the webapp and running it isn't a great idea, as this will force the client to wait for a response.
Therefore, you'd have to farm these tasks out to a background process that notifies the client (via AJAX, I guess) and/or stores the results in the database when it's complete.
You also don't want all these tasks running simultaneously in the case of multiple concurrent users, since that is a great way to bring your server to its knees even with a small number of concurrent requests. Ideally, you want each instance of your webapp to put its tasks into a job queue, which then automagically runs them in an optimal way (based on number of cores, available memory, etc.).
Are there any good Python libraries to help resolve this sort of an issue? Are there general strategies that people use in these kinds of situations? Or is this just a matter of choosing a good batch scheduler and spawning a new Python interpreter for each process?
django-celery provides Celery integration for Django: it uses the Django ORM and cache backend for storing results, autodiscovers task modules for applications listed in INSTALLED_APPS, and more. Celery itself is a task queue/job queue based on distributed message passing.
Alternatively, asynchronous code can speed up an application that has to deal with a high number of tasks simultaneously. As of the Django 3.1 release, Django supports async views, so if you are running under ASGI, writing async-specific views is now possible.
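The idea can be sketched with plain asyncio (`heavy_compute` and `results_view` are hypothetical stand-ins; in Django 3.1+ the same body would live inside an `async def view(request)` served under ASGI):

```python
import asyncio

def heavy_compute(n):
    # stand-in for the expensive numpy/scipy work
    return sum(i * i for i in range(n))

async def results_view(n):
    # offload the blocking computation to a thread so the event loop
    # stays free to serve other requests (Django offers sync_to_async
    # for the same purpose)
    return await asyncio.to_thread(heavy_compute, n)
```

Note that async views keep the *server* responsive, but a single long-running computation still ties up a thread; for multi-hour jobs a real task queue remains the better fit.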
We developed a Django web app that does heavy computation (each job takes 11 to 88 hours to complete, even on high-end servers).
Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.
Celery offers result backends, scheduling (via celery beat), automatic retries, rate limiting, monitoring, and workflow primitives (the canvas). This is just the tip of the iceberg; there are a lot more features. Take a look at the documentation & FAQ.
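To make the knobs concrete, here is an illustrative configuration sketch (broker URL, task paths, and values are all assumptions, not recommendations):

```python
# celeryconfig.py -- illustrative settings only
broker_url = "redis://localhost:6379/0"      # assumed Redis broker
result_backend = "django-db"                  # store results via django-celery-results

worker_concurrency = 4                        # match the number of cores
task_annotations = {
    # hypothetical task path; cap how often the heavy task may start
    "myapp.tasks.run_simulation": {"rate_limit": "10/m"},
}

beat_schedule = {
    # run a hypothetical nightly job via celery beat
    "nightly-report": {
        "task": "myapp.tasks.run_simulation",
        "schedule": 24 * 60 * 60,             # seconds
    },
}
```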
You also need to design a good canvas for your workflow. For example, you don't want all tasks running simultaneously when there are multiple concurrent users, since that would exhaust server resources. You might also want to schedule tasks based on which users are currently online.
You also need a solid database design, efficient algorithms, and so on.