Assume a button on my page which, when clicked, fires off an AJAX request to my API endpoint, which then fetches data from a 3rd-party site. Let's say this task takes around 2-5 seconds, with a timeout at 5 seconds. What's the ideal way to do this:
All tutorials I've seen suggest the Celery way, but that seems like a lot of machinery/overhead for a simple request with minimal processing. Is there some generally accepted threshold (seconds till completion, etc.) at which one would choose one over the other?
Then there is django-channels, which seems like it would be ideal for this. But, at first glance, the distinguishing line between Channels workers and Celery tasks seems blurred. Can I replace Celery with the Channels workers and just use that for the above-stated task? Would Channels also handle my longer-running tasks? What would be the advantages/drawbacks of Channels (either with Celery or replacing Celery)?
Finally, which of the 3 (celery/channels/in-view) would be the recommended approach to the example scenario given?
I am not an expert on channels but here we go.
Channels is an abstraction above WSGI (the new protocol is ASGI) which allows you to communicate over "abstract" channels. Sometimes you'll do HTTP, sometimes websockets, sometimes other stuff; you can do pretty much any communication pattern.
Celery is constructed in a similar manner: it uses a message bus (sometimes a more complex broker mechanism, depending on how you run it) to send work to a worker machine, which can send back optional results.
Now which do you choose?
Doing it in the view: I would avoid this unless you have a view specifically designed for the purpose. You will need to make sure that your stack can handle long-lived connections (Heroku's router will complain if it takes more than 30 seconds, for example), or you will want to implement some long-polling interface.
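For reference, here is a minimal sketch of that in-view variant, assuming the third-party call is made with requests against a placeholder URL (the URL and view name are illustrative, not from the question); the whole request/response cycle blocks until the upstream call returns or times out:

# views.py: blocking in-view variant (names and URL are placeholders)
import requests
from django.http import JsonResponse

THIRD_PARTY_URL = 'https://example.com/api/data'  # placeholder

def fetch_in_view(request):
    try:
        # the Django worker is tied up here for up to 5 seconds
        upstream = requests.get(THIRD_PARTY_URL, timeout=5)
        upstream.raise_for_status()
        return JsonResponse(upstream.json(), safe=False)
    except requests.Timeout:
        return JsonResponse({'error': 'third-party timeout'}, status=504)
    except requests.RequestException as exc:
        return JsonResponse({'error': str(exc)}, status=502)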
With Celery, you'll need to do all of the setup to get things online (the broker, the workers, and so on).
Having a task whose results you want will require a result backend and passing the task ID around.
You'll need to implement a view that can query Celery to figure out where the task is in terms of completion, success, etc.
E.g.:

# views.py: kick off the task somewhere
from rest_framework.response import Response
from .tasks import some_task

def create_task(request, *args, **kwargs):
    # .delay() returns an AsyncResult; hand its id back so the client can poll
    result = some_task.delay(param)  # param: whatever the task needs from the request
    return Response({'task_id': result.id})

# urls.py
url(r'^tasks/(?P<task_id>[\w-]+)/$', task_progress_view, name='task-progress'),

# views.py
def task_progress_view(request, task_id):
    # get fancier here, this is just an example
    return Response({'state': some_task.AsyncResult(task_id).state})
That is a really simplistic example, but it should do as a starting point.
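To make the result-backend point above concrete, here is a minimal sketch of how the Celery app and the task might be wired up; the Redis URLs, the third-party URL, and the retry settings are assumptions for illustration, not anything prescribed by the question:

# celery.py: minimal app wiring (broker/backend URLs are placeholders)
from celery import Celery

app = Celery(
    'myproject',
    broker='redis://localhost:6379/0',   # message bus
    backend='redis://localhost:6379/1',  # result backend, needed to query task state
)

# tasks.py
import requests
from celery import shared_task

@shared_task(bind=True, max_retries=3, soft_time_limit=10)
def some_task(self, param):
    try:
        # fetch from the third-party site with the 5 second timeout from the question
        resp = requests.get('https://example.com/api/data', params={'q': param}, timeout=5)
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException as exc:
        # retry a couple of times before giving up
        raise self.retry(exc=exc, countdown=2)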
With Channels, you'll need to set up a channel layer (the message bus) and get the required views together in essentially the same way as with Celery, except you'll still need a piece of code that fetches the data with some retry and timeout logic, etc.
Celery will take care of the work part; you'll have to take care of the updates and informing your client. Channels would be a reasonable way of dealing with that back and forth, but you may not need it.
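If you did go the Channels route for the example scenario, a rough sketch of a websocket consumer (Channels 2 style) that does the fetch itself could look like this; the consumer name, URL, and message shape are all assumptions:

# consumers.py: the client sends a message on button click, the consumer replies with the data
import json
import requests
from channels.generic.websocket import WebsocketConsumer

class FetchConsumer(WebsocketConsumer):
    def connect(self):
        self.accept()

    def receive(self, text_data=None, bytes_data=None):
        params = json.loads(text_data or '{}')
        try:
            resp = requests.get('https://example.com/api/data', params=params, timeout=5)
            resp.raise_for_status()
            self.send(text_data=json.dumps({'status': 'ok', 'data': resp.json()}))
        except requests.RequestException as exc:
            self.send(text_data=json.dumps({'status': 'error', 'detail': str(exc)}))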
I would think about what else you need to do. Most applications require asynchronous work at some point, because business logic often dictates it. If you plan on using websockets and the like, but you don't want to break your Django app down into services, I would just bite the bullet and do both.
If you don't need more than one protocol of communication, just use Celery and do the views.