
django: celery vs channels vs in-view

Assume a button on my page which when clicked fires off an ajax request to my api endpoint which then fetches data from a 3rd party site. Let's say this task takes around 2-5 seconds with a timeout at 5 seconds. What's the ideal way to do this:

  1. celery task.delay() in the api endpoint and return a url to poll every x intervals for the result.
  2. just do it in the view

All the tutorials I've seen suggest the celery way, but that seems like a lot of machinery/overhead for a simple request with minimal processing. Is there some generally accepted threshold (seconds until completion, etc.) at which one would choose one over the other?

Then there is django-channels, which seems like it would be ideal for this. But, at first glance, the line between channels workers and celery tasks seems blurred. Can I replace celery with the channels workers and just use that for the task described above? Would channels also handle my longer-running tasks? What would be the advantages/drawbacks of channels (either alongside celery or replacing it)?

Finally, which of the 3 (celery/channels/in-view) would be the recommended approach to the example scenario given?

Verbal_Kint asked Dec 04 '25 09:12


1 Answer

I am not an expert on channels but here we go.

Channels is an abstraction layered above WSGI (the newer protocol is ASGI) that lets you communicate over "abstract" channels. Sometimes you'll do HTTP, sometimes websockets, sometimes something else; you can build pretty much any communication pattern on it.

Celery is built along similar lines: it uses a message bus (or a more complex broker mechanism, depending on how you run it) to send work to a worker machine, which can optionally send results back.
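To make that pattern concrete, here is a toy, standard-library-only sketch of the broker / worker / result-backend loop. It is not Celery's API: `broker`, `results`, `worker`, and `delay` are hypothetical stand-ins, and a real setup would use something like Redis or RabbitMQ rather than an in-process queue.

```python
import queue
import threading
import uuid

broker = queue.Queue()  # stands in for the message bus
results = {}            # stands in for a result backend


def worker():
    """Pull messages off the bus, run them, and store the outcome."""
    while True:
        msg = broker.get()
        try:
            if msg is None:  # sentinel: shut the worker down
                return
            task_id, func, args = msg
            try:
                results[task_id] = ("SUCCESS", func(*args))
            except Exception as exc:
                results[task_id] = ("FAILURE", repr(exc))
        finally:
            broker.task_done()


def delay(func, *args):
    """Queue the work and hand back an id the caller can poll with."""
    task_id = str(uuid.uuid4())
    results[task_id] = ("PENDING", None)
    broker.put((task_id, func, args))
    return task_id


# One background worker; a real deployment runs these as separate processes.
threading.Thread(target=worker, daemon=True).start()
```

The caller only ever holds the id returned by `delay`, which is the same shape as the task-ID polling flow described further down.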

Now which do you choose?

In the view

I would avoid this unless you have a view specifically designed for this purpose. You will need to make sure that your stack can handle long-lived connections (Heroku's router, for example, will complain if a request takes more than 30 seconds), or you will want to implement some long-polling interface.
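If you do go in-view, the main thing is to bound the upstream call so a slow third party cannot hold the request open. A minimal sketch using only the standard library; `fetch_third_party`, the `opener` parameter, and the status codes chosen are my own assumptions, not anything from the question:

```python
import json
import socket
import urllib.error
import urllib.request

THIRD_PARTY_TIMEOUT = 5.0  # seconds, the upper bound mentioned in the question


def fetch_third_party(url, timeout=THIRD_PARTY_TIMEOUT,
                      opener=urllib.request.urlopen):
    """Return (http_status, payload) ready to be turned into a JSON response.

    Note: urlopen's timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return 200, {"data": json.loads(resp.read().decode("utf-8"))}
    except socket.timeout:
        # Timed out while reading the response body.
        return 504, {"error": "upstream timed out"}
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return 504, {"error": "upstream timed out"}
        return 502, {"error": str(exc.reason)}
    except ValueError as exc:
        # Body was not valid UTF-8 JSON.
        return 502, {"error": "bad upstream payload: %s" % exc}
```

In a Django view this would be roughly `status, payload = fetch_third_party(url)` followed by `return JsonResponse(payload, status=status)`. The `opener` argument is only there so the function can be exercised without a network.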

With Celery

You'll need to do all of the setup to get things online: a broker, at least one worker process, and the Celery app configuration.

Having a task whose results you want will require a result backend and passing the task ID around.

You'll need to implement a view that queries Celery to find out where the task stands: still pending, succeeded, failed, and so on.

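For example:

# kick off the task somewhere (Response is assumed to be Django REST framework's)

def create_task(request, *args, **kwargs):
    # delay() returns an AsyncResult, so send back its id rather than the object
    result = some_task.delay(param)
    return Response({'task_id': result.id})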

urls.py

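# assumes `from django.urls import path` and the view defined in views.py below
path('tasks/<str:task_id>/', task_progress_view, name='task-progress')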

views.py

def task_progress_view(request, task_id):
    # get fancier here, this is just an example
    return Response(some_task.AsyncResult(task_id).state)

That is a really simplistic example but that should do as a starting point.
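One way to "get fancier" in the progress view above is to translate the task state into an HTTP status the client can act on. This is only a sketch: `progress_payload` and the status-code mapping are my own choices, and the state names are the standard Celery ones as far as I know.

```python
def progress_payload(state, result=None):
    """Translate a Celery task state into (http_status, body).

    Assumed usage inside the polling view:
        res = some_task.AsyncResult(task_id)
        status, body = progress_payload(res.state, res.result)
        return Response(body, status=status)
    """
    if state == "SUCCESS":
        return 200, {"state": state, "result": result}
    if state in ("FAILURE", "REVOKED"):
        # On failure, AsyncResult.result holds the raised exception.
        return 500, {"state": state, "error": str(result)}
    # PENDING, RECEIVED, STARTED, RETRY: not finished, the client polls again.
    # Celery also reports PENDING for ids it has never seen.
    return 202, {"state": state}
```

The client keeps polling while it gets a 202 and stops on anything else.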

With channels

You'll need to set up a bus and put the required views together in essentially the same way as with Celery. The difference is that you'll still need to write the piece of code that actually fetches the data, including retry and timeout logic.
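The retry and timeout piece is framework-agnostic, so here is a rough asyncio sketch of it. Nothing here is Channels API: `fetch_with_retry`, its parameters, and the backoff policy are hypothetical, and `fetch` is whatever coroutine function you use to call the third party.

```python
import asyncio


async def fetch_with_retry(fetch, attempts=3, timeout=5.0, backoff=0.5):
    """Await fetch() with a per-attempt timeout, retrying on timeout or I/O error."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(fetch(), timeout=timeout)
        except (asyncio.TimeoutError, OSError) as exc:
            last_exc = exc
            if attempt < attempts - 1:
                # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
                await asyncio.sleep(backoff * 2 ** attempt)
    raise last_exc
```

With the defaults, the worst case is three attempts of five seconds each plus the backoff sleeps, so tune `attempts` and `timeout` against whatever overall deadline the client can tolerate.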

What to choose

Celery will take care of the work part; you'll have to take care of the updates and informing your client. Channels would be a reasonable way of dealing with that back and forth, but you may not need it.

I would think about what else you need to do. Most applications require asynchronous work at some point because business logic often dictates it. If you plan on using websockets and the like but don't want to break your Django app down into services, I would just bite the bullet and do both.

If you don't need more than one protocol of communication, just use celery and do the views.

theWanderer4865 answered Dec 07 '25 05:12