Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handling computationally intensive tasks in a Django webapp

I have a desktop app that I'm in the process of porting to a Django webapp. The app has some quite computationally intensive parts (using numpy, scipy and pandas, among other libraries). Obviously importing the computationally intensive code into the webapp and running it isn't a great idea, as this will force the client to wait for a response.

Therefore, you'd have to farm these tasks out to a background process that notifies the client (via AJAX, I guess) and/or stores the results in the database when it's complete.

You also don't want all these tasks running in simultaneously in the case of multiple concurrent users, since that is a great way to bring your server to its knees even with a small number of concurrent requests. Ideally, you want each instance of your webapp to put its tasks into a job queue, that then automagically runs them in an optimal way (based on number of cores, available memory, etc.).

Are there any good Python libraries to help resolve this sort of an issue? Are there general strategies that people use in these kinds of situations? Or is this just a matter of choosing a good batch scheduler and spawning a new Python interpreter for each process?

like image 598
Chinmay Kanchi Avatar asked Sep 09 '14 17:09

Chinmay Kanchi


People also ask

What is the use of celery in Django?

django-celery provides Celery integration for Django; Using the Django ORM and cache backend for storing results, autodiscovery of task modules for applications listed in INSTALLED_APPS, and more. Celery is a task queue/job queue based on distributed message passing.

What is asynchronous task in Django?

Django Rest Framework. When writing an application, asynchronous code allows you to speed up an application that has to deal with a high number of tasks simultaneously. With the Django 3.1 release, Django now supports async views, so if you are running ASGI, writing async specific tasks is now possible!


1 Answers

We developed a Django web app which does heavy computation(Each process will take 11 to 88 hours to complete on high end servers).

Celery: Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

Celery offers

  • Run tasks asynchronously.
  • Distributed execution of expensive processes.
  • Periodic and/or scheduled tasks.
  • Retrying tasks if something goes wrong.

This is just the tip of iceberg. There are a hell lot of features celery offers. Take a look at documentation & FAQ.

You also need to design a very good canvas for workflow. For example, you don't want all tasks running simultaneously in the case of multiple concurrent users, since it is a resource consumption. Also you might want to schedule tasks based on users who are currently online.

Also you need very good database design, efficient algorithms and so on.

like image 122
Pandikunta Anand Reddy Avatar answered Oct 10 '22 17:10

Pandikunta Anand Reddy