 

Unit testing celery tasks directly

I know this will be seen as a duplicate, but I have looked around before asking this question; however, all of the questions seem to be either outdated or unhelpful for my problem. This is where I've looked before writing this question:

  • Official Docs
  • How do you unit test a Celery task? (5 years old, all dead links)
  • How to unit test code that runs celery tasks? (2 years old)
  • How do I capture Celery tasks during unit testing? (3 years old)

I'm currently working on a project that heavily uses Celery to handle asynchronous tasks. To make the entire codebase stable, I'm writing unit tests for the whole project; however, I haven't been able to write a single working test for Celery so far.

Most of my code needs to keep track of the tasks that were run in order to determine whether or not all results are ready to be queried. This is implemented in my code as follows:

@app.task(bind=True)
def some_task(self, record_id):
    associate(self.request.id, record_id)  # Not the actual DB code, but you get the idea

# Somewhere else in my code, eg: Flask endpoint
record = some_db_record()
some_task.apply_async(args=[record.id])

Since I don't have a *nix based machine to run my code on, I tried solving this by setting the task_always_eager option to True; however, this causes issues whenever any sub-task tries to query the result:

import time
from celery.result import AsyncResult

@app.task(bind=True)
def foo(self): 
    task = bar.apply_async()
    foo_poll.apply_async(args=[task.id]) 

@app.task(bind=True, max_retries=None)
def foo_poll(self, celery_id):
    task = AsyncResult(celery_id)
    if not task.ready():  # RuntimeError: Cannot retrieve result with task_always_eager enabled
        return self.retry(countdown=5)
    else:
        pass  # Do something with the result

@app.task
def bar():
    time.sleep(10)
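
For reference, eager mode is enabled roughly like this in my test setup (a minimal sketch using the Celery 4 setting names; older versions use CELERY_ALWAYS_EAGER):

from celery import Celery

app = Celery("tests")
app.conf.update(
    task_always_eager=True,       # run tasks synchronously in the calling process
    task_eager_propagates=True,   # re-raise task exceptions instead of storing them
)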

I tried fixing this by patching the AsyncResult methods; however, this caused issues because self.request.id would be None:

from unittest.mock import patch
from celery.states import SUCCESS

with patch.object(AsyncResult, "_get_task_meta", side_effect=lambda: {"status": SUCCESS, "result": None}) as method:
    foo()

@app.task(bind=True)
def foo(self):
    pass   # self.request.id is now None, which I need to track sub-tasks

Does anyone know how I could do this? Or whether Celery is even worth using anymore? I'm at the point where I find the documentation and any questions related to testing so overwhelmingly complex that I just feel like ditching it altogether and going back to multithreading.

asked Jul 10 '17 by Paradoxis


People also ask

How does Celery execute tasks?

Celery workers are worker processes that run tasks independently from one another and outside the context of your main service. Celery beat is a scheduler that orchestrates when to run tasks. You can use it to schedule periodic tasks as well.
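
For example, a periodic task can be registered with Celery beat roughly like this (a minimal sketch; the task name and interval are made up for illustration):

from celery import Celery

app = Celery("myapp")
app.conf.beat_schedule = {
    "poll-every-30-seconds": {
        "task": "myapp.tasks.poll_results",  # hypothetical task name
        "schedule": 30.0,                    # run every 30 seconds
    },
}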

How do you pass arguments in Celery task?

To pass arguments to a task with apply_async(), you need to wrap them in a list and pass the list as the first argument, i.e. apply_async([arg1, arg2, arg3]). See the documentation for more details and examples. Use delay() as an alternative.
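
A minimal sketch of both call styles (the task and its arguments are made up for illustration):

@app.task
def add(x, y):
    return x + y

# Positional arguments go in a list/tuple, keyword arguments in a dict.
add.apply_async(args=[2, 3])
add.apply_async(args=(2,), kwargs={"y": 3})

# delay() is a shortcut that forwards *args and **kwargs directly.
add.delay(2, 3)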

What is Shared_task in Celery?

The "shared_task" decorator allows creation of Celery tasks for reusable apps as it doesn't need the instance of the Celery app. It is also easier way to define a task as you don't need to import the Celery app instance.

Is Celery synchronous?

Celery tasks run asynchronously, which means that the Celery function call in the calling process returns immediately after the message request to perform the task is sent to the broker. There are two ways to get results back from your tasks.
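
A minimal sketch of retrieving a result through the result backend (assuming one is configured; the task is made up for illustration):

result = add.delay(2, 3)       # returns an AsyncResult immediately

if result.ready():             # poll without blocking
    print(result.result)

print(result.get(timeout=10))  # or block until the result arrives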


2 Answers

I had about the same issue and came up with two possible approaches:

  1. Call tasks directly in tests and wrap every inner Celery interaction in an if self.request.called_directly check: run the task directly when it is True, or dispatch it with apply_async when it is False (see the sketch after this list).
  2. Wrap task.ready() and other status checks in functions where I check both for ALWAYS_EAGER and for task readiness.
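
Roughly, both ideas look like this (a sketch; names like outer_task and inner_task are illustrative, and the eager check assumes the Celery 4 setting name task_always_eager):

@app.task(bind=True)
def outer_task(self):
    if self.request.called_directly:
        # Approach 1: when called directly (e.g. in a test), run the
        # inner work synchronously instead of dispatching it.
        inner_task()
    else:
        inner_task.apply_async()

def task_is_ready(result):
    # Approach 2: treat results as ready when running eagerly, so
    # readiness checks don't try to hit a result backend.
    if app.conf.task_always_eager:
        return True
    return result.ready()

@app.task
def inner_task():
    pass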

Eventually I came up with kind of a mix of both, with the rule to avoid nested tasks as much as I can, and to put as little code inside @app.task functions as possible so that I can test the task functions in as much isolation as possible.

It might look quite frustrating and awful, but in fact it's not.

Also you can check how big guys like Sentry do this (spoiler: mocks and some nifty helpers).

So it's definitely possible; it's just not easy to find established best practices around it.

answered Sep 27 '22 by valignatev


It is possible to test the function without the celery task binding by calling it directly and by using a mock to replace the task object.

The inner function is hidden behind some_task.__wrapped__.__func__.

Here is an example of how to use it in a test case:

from unittest.mock import Mock

def test_some_task(self):
    mock_task = Mock()
    mock_task.request.id = 5  # your test data here
    record_id = 5  # more test data
    some_task_inner = some_task.__wrapped__.__func__
    some_task_inner(mock_task, record_id)
    # ...

answered Sep 27 '22 by Erik Kalkoken