
Django-celery project, how to handle results from result-backend?

Tags: django, celery

1) I am currently working on a web application that exposes a REST API and uses Django and Celery to handle requests and solve them. For a request to be solved, a set of Celery tasks has to be submitted to an AMQP queue, so that they get executed on workers (situated on other machines). Each task is very CPU-intensive and takes a very long time (hours) to finish.
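For illustration, a minimal sketch of what such a task and its submission could look like (the task name solve_chunk and its body are hypothetical, not from the question):

# tasks.py -- illustrative only; the task name and body are made up.
from celery import shared_task

@shared_task
def solve_chunk(data):
    # Stand-in for hours of CPU-intensive work done on a worker machine.
    return sum(x * x for x in data)

# In the Django view that handles the REST request:
# async_result = solve_chunk.delay(range(1000))
# async_result.id can be saved so the result can be looked up later.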

I have configured Celery to also use AMQP as the result backend, and I am using RabbitMQ as Celery's broker.
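For reference, a minimal sketch of that setup in settings.py, using the Celery 3.x-era setting names that django-celery expects (the broker URL is a placeholder for your own RabbitMQ host):

# settings.py
BROKER_URL = 'amqp://guest:guest@localhost:5672//'  # RabbitMQ as broker
CELERY_RESULT_BACKEND = 'amqp'                      # store results in AMQP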

Each task returns a result that needs to be stored in a DB afterwards, but not by the workers directly. Only the "central node" - the machine running django-celery and publishing tasks to the RabbitMQ queue - has access to this storage DB, so the results from the workers somehow have to make it back to this machine.

The question is: how can I process the results of the task executions afterwards? After a worker finishes, its result gets stored in the configured result backend (AMQP), but I don't know what the best way is to retrieve the results from there and process them.

All I could find in the documentation is that you can either check on a result's status from time to time with:

result.state

which basically means I need a dedicated piece of code that periodically runs this check, and therefore keeps a whole thread/process busy with only this, or block everything with:

result.get()

until a task finishes, which is not what I want.
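A minimal sketch of the two options, assuming the task id was saved at submission time:

from celery.result import AsyncResult

def check_task(task_id):
    """Non-blocking: inspect the current state and return early."""
    res = AsyncResult(task_id)
    if res.state == 'SUCCESS':
        return res.result   # the task's return value
    return None             # still PENDING / STARTED / ...

def wait_for_task(task_id):
    """Blocking: waits until the task finishes (or the timeout expires)."""
    return AsyncResult(task_id).get(timeout=3600)

One caveat worth keeping in mind: with the amqp result backend, each result is delivered as a message, so a result can effectively be retrieved only once.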

The only solution I can think of is to have an extra thread on the "central node" that periodically runs a function which checks on the async_results returned by each task at its submission, and takes action when a task has reached a finished status.
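A minimal sketch of that idea; store_in_db is a placeholder for whatever writes results to the storage DB, and pending_ids is assumed to be filled in when tasks are submitted:

import threading
import time

from celery.result import AsyncResult

pending_ids = set()   # task ids recorded when tasks are submitted

def store_in_db(result):
    pass   # placeholder: write the result to the storage DB

def poll_results(interval=30):
    while True:
        for task_id in list(pending_ids):
            res = AsyncResult(task_id)
            if res.ready():                  # finished: SUCCESS or FAILURE
                if res.successful():
                    store_in_db(res.result)
                pending_ids.discard(task_id)
        time.sleep(interval)

poller = threading.Thread(target=poll_results)
poller.daemon = True   # don't keep the process alive just for polling
poller.start()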

Does anyone have any other suggestion?

Also, since the processing of the backend results takes place on the "central node", my aim is to minimize the impact of this operation on that machine.

What would be the best way to do that?

2) How do people usually solve the problem of dealing with the results returned from the workers and put into the result backend? (assuming a result backend has been configured)

asked Feb 06 '13 by Clara



1 Answer

I'm not sure if I fully understand your question, but take into account that each task has a task id. If tasks are being sent by users, you can store the ids and then check for the results as JSON, as follows:

# urls.py
from django.conf.urls import patterns, url   # old-style Django URL conf
from djcelery.views import is_task_successful

urlpatterns += patterns('',
    # Returns JSON saying whether the task with the given id succeeded.
    url(r'(?P<task_id>[\w\d\-\.]+)/done/?$', is_task_successful,
        name='celery-is_task_successful'),
)
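A hypothetical client-side check against that URL; the host, the task id, and the exact shape of the returned JSON are assumptions, not taken from the djcelery docs:

import requests   # third-party HTTP client, used here for illustration

task_id = '...'   # the id saved when the task was submitted
resp = requests.get('http://central-node/%s/done/' % task_id)
print(resp.json())   # JSON indicating whether the task executed successfully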

Another related concept is that of signals: each finished task emits one. A successfully finished task will emit a task_success signal. More can be found in the Celery docs on real-time processing and monitoring.
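A minimal sketch of a task_success handler; note that signal handlers run inside the worker process, so this only helps if the handler can reach the DB or forwards the result to the central node:

from celery.signals import task_success

@task_success.connect
def on_task_success(sender=None, result=None, **kwargs):
    # `sender` is the task that finished; `result` is its return value.
    print('Task %s finished with result %r' % (sender.name, result))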

answered Sep 20 '22 by jdcaballerov