Size limit on Celery task arguments?

We have a Celery task that requires a Pandas dataframe as an input. The dataframe is first serialized to JSON and then passed as an argument into the task. The dataframes can have around 35 thousand entries, which results in a JSON dictionary occupying about 700kB. We are using Redis as a broker.

Unfortunately the call to delay() on this task often takes too long (in excess of thirty seconds), and our web requests time out.

Is this the kind of scale that Redis and Celery should be able to handle? I presumed it was well within limits and the problem lies elsewhere, but I can't find any guidance or experience on the internet.
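
Roughly, the pattern looks like this (a minimal sketch; the task and argument names are illustrative):

import pandas as pd
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def process_dataframe(df_json):
    # The worker rebuilds the dataframe from the JSON string argument
    df = pd.read_json(df_json)
    # ... processing ...

# In the web request: ~35 thousand rows serialize to roughly 700 kB of JSON
df = pd.DataFrame({"value": range(35_000)})  # stand-in for the real dataframe
process_dataframe.delay(df.to_json())  # this call is what takes over thirty seconds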

asked Nov 18 '22 by biddlesby

1 Answer

I would suggest saving the JSON in your database and passing only the id to the Celery task instead of the whole JSON payload.

class TodoTasks(models.Model):
    serialized_json = models.TextField()
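
The view then creates the row and enqueues only its primary key, something like this (a sketch; process_dataframe is whatever your task is called):

todo = TodoTasks.objects.create(serialized_json=df.to_json())
process_dataframe.delay(todo.pk)  # only a small integer goes through Redis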

Moreover, you can keep a record of the task's status with a few extra fields, and even store the error traceback (which I find very useful for debugging):

import traceback
from django.db import models

class TodoTasks(models.Model):
    class StatusChoices(models.TextChoices):
        PENDING = "PENDING", "Awaiting celery to process the task"
        SUCCESS = "SUCCESS", "Task done with success"
        FAILED = "FAILED", "Task failed to be processed"

    serialized_json = models.TextField()

    status = models.CharField(
        max_length=10, choices=StatusChoices.choices, default=StatusChoices.PENDING
    )
    created_date = models.DateTimeField(auto_now_add=True)
    processed_date = models.DateTimeField(null=True, blank=True)
    error = models.TextField(null=True, blank=True)

    def handle_exception(self):
        # Store the full traceback, mark the task as failed and persist the row
        self.error = traceback.format_exc()
        self.status = self.StatusChoices.FAILED
        self.save()
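
On the worker side, the task fetches the row by id, rebuilds the dataframe and records the outcome. A rough sketch (assuming your Celery app instance is importable as app; adapt the imports to your project layout):

import pandas as pd
from django.utils import timezone

from myproject.celery import app
from myapp.models import TodoTasks

@app.task
def process_dataframe(todo_id):
    todo = TodoTasks.objects.get(pk=todo_id)
    try:
        df = pd.read_json(todo.serialized_json)
        # ... do the actual work on df ...
        todo.status = TodoTasks.StatusChoices.SUCCESS
        todo.processed_date = timezone.now()
        todo.save()
    except Exception:
        # handle_exception() stores the traceback, marks the row as failed and saves it
        todo.handle_exception()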
answered Nov 30 '22 by Swann_bm