Django: Should I kick off a separate process?

Tags: process, django

I'm writing an app that will allow the user to upload data in a file; the app will process this data, and email the results to the user. Processing may take some time, so I would like to handle this separately in a Python script rather than wait in the view for it to complete. The Python script and view don't need to communicate as the script will pick up the data from a file written by the view. The view will just put up a message like "Thanks for uploading your data - the results will be emailed to you"

What's the best way to do this in Django? Spawn off a separate process? Put something on a queue?

Some example code would be greatly appreciated. Thanks.

FunLovinCoder asked Nov 27 '10 13:11


2 Answers

The simplest possible solution is to write a custom management command that searches for all the unprocessed files, processes them, and then emails the user. Management commands run inside the Django framework, so they have access to all models, db connections, etc., but you can call them from anywhere, for example from crontab.

If you care about the delay between the file being uploaded and processing starting, you could use a framework like Celery, which is basically a helper library for using a message queue and running workers that listen on the queue. This would give you pretty low latency, but on the other hand, simplicity might be more important for you.

I would strongly advise against starting threads or spawning processes in your views. Threads would run inside the Django process and could take down your web server (depending on your configuration), and a child process would inherit everything from the Django process, which you probably don't want. It is better to keep this stuff separate.

knutin answered Oct 15 '22 15:10


I currently have a project with similar requirements (just more complicated^^).

Never spawn a subprocess or thread from your Django view. You have no control over the Django processes: they are managed by the web server (e.g. Apache via WSGI) and could be killed, paused, etc. before the task finishes.

What I would do is use an external script, which would run in a separate process. I think you have a few options:

  • A process that is always running and polling the directory where you put your files. It would, for example, check the directory every ten seconds and process the files.
  • Same as above, but run by cron every x minutes (cron cannot schedule more often than once a minute). This basically has the same effect.
  • Use Celery to create worker processes and add jobs to the queue from your Django application. Then you will need to get the results back by one of the means available with Celery.

Now you probably need to access the information in your Django models to email the user at the end. Here you have several options:

  • Import your modules (models etc) from the external script
  • Implement the external script as a custom command (as knutin suggested)
  • Communicate the results to the Django application via a POST request for example. Then you would do the email sending and status changes etc in a normal Django view.
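For the POST option, the external script could report back with nothing more than the standard library. This is only a sketch: the endpoint URL, the JSON field names, and the function names are hypothetical, and the receiving Django view is not shown.

```python
import json
import urllib.request


def build_result_request(url, upload_id, status, summary):
    """Build a JSON POST request describing one finished job."""
    payload = json.dumps(
        {"upload_id": upload_id, "status": status, "summary": summary}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_django(url, upload_id, status, summary, timeout=10):
    """Send the result to the Django application and return the HTTP status."""
    request = build_result_request(url, upload_id, status, summary)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call from the external script (URL is hypothetical):
#     notify_django("http://localhost:8000/uploads/result/", 42, "done", "3 lines")
```

Since the request does not come from a browser form, the receiving view would likely need to be exempted from Django's CSRF check and protected some other way, for example with a shared secret.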

I would go for an external process that either imports the modules or sends a POST request. This way it is much more flexible. You could, for example, make use of the multiprocessing module to process several files at the same time (thus using multi-core machines efficiently).

A basic workflow would be:

  1. Check the directory for new files
  2. For each file (can be parallelized):
    1. Process
    2. Send email or notify your Django application
  3. Sleep for a while
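The workflow above can be sketched with the standard library alone. The directory name, the `.csv` suffix, the `.done` rename used to mark finished files, and the line-counting placeholder are assumptions. The real processing and the email or notification step would replace the marked spots.

```python
import time
from multiprocessing import Pool
from pathlib import Path

UPLOAD_DIR = "uploads"  # hypothetical directory written to by the Django view
POLL_SECONDS = 10


def find_new_files(directory):
    """Step 1: list files that have not been processed yet."""
    return sorted(str(p) for p in Path(directory).glob("*.csv"))


def process_file(path_str):
    """Step 2.1: placeholder processing that counts the lines in the file."""
    path = Path(path_str)
    with path.open() as fh:
        line_count = sum(1 for _ in fh)
    # Rename the file so later polls skip it.
    path.rename(path.with_name(path.name + ".done"))
    return path.name, line_count


def run_once(directory, map_fn=map):
    """Run one polling cycle; pass pool.map as map_fn to parallelize."""
    results = list(map_fn(process_file, find_new_files(directory)))
    for name, line_count in results:
        # Step 2.2: send the email or notify the Django application here.
        print(f"{name}: {line_count} lines")
    return results


def main():
    with Pool() as pool:  # one worker process per CPU core by default
        while True:
            run_once(UPLOAD_DIR, pool.map)
            time.sleep(POLL_SECONDS)  # step 3
```

To run it as a daemon, call `main()` from a script entry point guarded by `if __name__ == "__main__":`, which multiprocessing needs on platforms that spawn rather than fork worker processes.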

My project contains really CPU-demanding processing. I currently use an external process that gives processing jobs to a pool of worker processes (that's basically what Celery could do for you) and reports the progress and results back to the Django application via POST requests. It works really well and is relatively scalable, but I will soon change it to use Celery on a cluster.

Marc Demierre answered Oct 15 '22 15:10