Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can you perform multi-threaded tasks within Django?

The sequence I would like to accomplish:

  1. A user clicks a button on a web page
  2. Some functions in model.py start to run. For example, gathering some data by crawling the internet
  3. When the functions are finished, the results are returned to the user.

Should I open a new thread inside of model.py to execute my functions? If so, how do I do this?

like image 431
Robert Avatar asked Jul 11 '13 19:07

Robert


People also ask

Is Django single threaded or multithreaded?

Django itself does not determine whether it runs in one or more threads. This is the job of the server running Django. The development server used to be single-threaded, but in recent versions it has been made multithreaded.

Does Django use multiprocessing?

Essentially Django serves WSGI request-response cycle which knows nothing of multiprocessing or background tasks.

Is multithreading possible in Python?

Python doesn't support multi-threading because Python on the Cpython interpreter does not support true multi-core execution via multithreading. However, Python does have a threading library. The GIL does not prevent threading.

Is Django development server multithreaded?

The Django team has declared that they will not offer a multi-threaded development server, for good or bad, so we are left to our own devices.


2 Answers

As shown in this answer you can use the threading package to perform an asynchronous task. Everyone seems to recommend Celery, but it is often overkill for performing simple but long running tasks. I think it's actually easier and more transparent to use threading.

Here's a simple example for asyncing a crawler:

#views.py
import threading
from .models import Crawl

def startCrawl(request):
    task = Crawl()
    task.save()
    t = threading.Thread(target=doCrawl,args=[task.id])
    t.setDaemon(True)
    t.start()
    return JsonResponse({'id':task.id})

def checkCrawl(request,id):
    task = Crawl.objects.get(pk=id)
    return JsonResponse({'is_done':task.is_done, result:task.result})

def doCrawl(id):
    task = Crawl.objects.get(pk=id)
    # Do crawling, etc.

    task.result = result
    task.is_done = True
    task.save()

Your front end can make a request for startCrawl to start the crawl, it can make an Ajax request to check on it with checkCrawl which will return true and the result when it's finished.


Update for Python3:

The documentation for the threading library recommends passing the daemon property as a keyword argument rather than using the setter:

t = threading.Thread(target=doCrawl,args=[task.id],daemon=True)
t.start()

Update for Python <3.7:

As discussed here, this bug can cause a slow memory leak that can overflow a long running server. The bug was fixed for Python 3.7 and above.

like image 101
nbwoodward Avatar answered Oct 14 '22 17:10

nbwoodward


  1. Yes it can multi-thread, but generally one uses Celery to do the equivalent. You can read about how in the celery-django tutorial.
  2. It is rare that you actually want to force the user to wait for the website. While it's better than risks a timeout.

Here's an example of what you're describing.

User sends request
Django receives => spawns a thread to do something else.
main thread finishes && other thread finishes 
... (later upon completion of both tasks)
response is sent to user as a package.

Better way:

User sends request
Django receives => lets Celery know "hey! do this!"
main thread finishes
response is sent to user
...(later)
user receives balance of transaction 
like image 42
cwallenpoole Avatar answered Oct 14 '22 16:10

cwallenpoole