
Celery and transaction.atomic

In some Django views, I use a pattern like this to save changes to a model and then do some asynchronous updating (such as generating images or further altering the model) based on the new model data. mytask is a Celery task:

with transaction.atomic():
    mymodel.save()
    mytask.delay(mymodel.id).get()

The problem is that the task never returns. Looking at celery's logs, the task gets queued (I see "Received task" in the log), but it never completes. If I move the mytask.delay...get call out of the transaction, it completes successfully.

Is there some incompatibility between transaction.atomic and celery? Is it possible in Django 1.6 or 1.7 for me to have both regular model updates and updates from a separate task process under one transaction?

My database is PostgreSQL 9.1. I'm using celery==3.1.16 / django-celery 3.1.16, amqp==1.4.6, Django==1.6.7, kombu==3.0.23. The broker backend is amqp, with RabbitMQ as the queue.

user85461 asked Nov 15 '14 04:11



2 Answers

As @dotz mentioned, there is little point in spawning an async task and then immediately blocking until it finishes.

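Moreover, if you wait on the task this way (the .get() at the end), you can be sure that the worker won't see the changes you just made to the mymodel instance, because they haven't been committed yet: remember, you're still inside the atomic block. If the task then tries to update that same row, it will most likely block on the lock held by the open transaction while the view blocks on .get(), which would explain why the task never completes.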
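The visibility problem can be reproduced without Django or Celery. The sketch below is only an illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for PostgreSQL, and the table layout, the status column, and the read_status helper are hypothetical names invented for the demo. One connection plays the view inside its transaction, the other plays the worker.

```python
import os
import sqlite3
import tempfile


def read_status(conn):
    """Read the row through the given connection (hypothetical helper)."""
    rows = conn.execute("SELECT status FROM mymodel WHERE id = 1").fetchall()
    return rows[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sqlite3")

    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # transaction boundaries below are exactly the BEGIN/COMMIT we issue.
    view = sqlite3.connect(path, isolation_level=None)    # plays the Django view
    worker = sqlite3.connect(path, isolation_level=None)  # plays the Celery worker

    view.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, status TEXT)")
    view.execute("INSERT INTO mymodel VALUES (1, 'old')")

    # The "view" opens a transaction and changes the row without committing.
    view.execute("BEGIN")
    view.execute("UPDATE mymodel SET status = 'new' WHERE id = 1")

    # The "worker" reads while that transaction is still open.
    seen_before_commit = read_status(worker)

    view.execute("COMMIT")

    # Only now is the change visible to the other connection.
    seen_after_commit = read_status(worker)

    view.close()
    worker.close()

print(seen_before_commit, seen_after_commit)
```

The second connection keeps reading the old value until the first one commits, which is the situation a task is in when it is queued and awaited from inside the atomic block.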

What you can do instead (since Django 1.9) is defer queuing the task until the currently active transaction has been committed, using the django.db.transaction.on_commit hook:

from django.db import transaction

with transaction.atomic():
    mymodel.save()
    transaction.on_commit(lambda:
        mytask.delay(mymodel.id))

I use this pattern quite often in my post_save signal handlers that trigger some processing of new model instances. For example:

from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver
from . import models   # Your models defining some Order model
from . import tasks   # Your tasks defining a routine to process new instances

@receiver(post_save, sender=models.Order)
def new_order_callback(sender, instance, created, **kwargs):
    """ Automatically triggers processing of a new Order. """
    if created:
        transaction.on_commit(lambda:
            tasks.process_new_order.delay(instance.pk))

This way, however, your task won't be executed if the database transaction fails. That is usually the desired behavior, but keep it in mind.
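To make that behavior concrete, here is a toy model of the idea in plain Python. It is not Django's implementation: ToyTransaction and its methods are hypothetical names invented for this sketch, and it ignores nesting, savepoints, and real database work. It only shows callbacks being held until a clean exit and discarded when the block raises.

```python
from contextlib import contextmanager


class ToyTransaction:
    """Hypothetical stand-in: holds callbacks until the block exits cleanly."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, func):
        # Register the callback; nothing runs yet.
        self._callbacks.append(func)

    @contextmanager
    def atomic(self):
        self._callbacks = []
        try:
            yield
        except Exception:
            # "Rollback": drop the pending callbacks, then re-raise.
            self._callbacks = []
            raise
        else:
            # "Commit": run the callbacks in registration order.
            callbacks, self._callbacks = self._callbacks, []
            for callback in callbacks:
                callback()


queued = []  # stands in for the task queue
tx = ToyTransaction()

# Successful block: the callback runs only after the block ends.
with tx.atomic():
    tx.on_commit(lambda: queued.append("order 1"))
    queued_inside_block = list(queued)  # still empty at this point

# Failing block: the callback is discarded and never runs.
try:
    with tx.atomic():
        tx.on_commit(lambda: queued.append("order 2"))
        raise RuntimeError("simulated database failure")
except RuntimeError:
    pass

print(queued_inside_block, queued)
```

Only "order 1" ends up in the list, and only after its block has finished, which mirrors the two properties described above.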

Edit: It's actually nicer to register the on_commit Celery task this way (without a lambda):

transaction.on_commit(tasks.process_new_order.s(instance.pk).delay)
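The answer does not say why this form is nicer. One practical difference, offered here as an assumption rather than the author's stated reason, is that a signature binds its arguments when the callback is registered, while a lambda looks its variables up when it finally runs. The plain-Python sketch below uses functools.partial as an analogue of the signature, and process_new_order is a hypothetical function standing in for the task.

```python
from functools import partial


def process_new_order(pk):
    """Hypothetical stand-in for queuing the Celery task."""
    return f"processing order {pk}"


late_bound = []   # callbacks built with lambda
early_bound = []  # callbacks built with partial

for pk in (1, 2, 3):
    # The lambda reads pk when it is called, after the loop has finished.
    late_bound.append(lambda: process_new_order(pk))
    # partial stores the current value of pk at registration time.
    early_bound.append(partial(process_new_order, pk))

# Run the callbacks later, as a commit hook would.
late_results = [callback() for callback in late_bound]
early_results = [callback() for callback in early_bound]

print(late_results)
print(early_results)
```

In the single-instance signal handler above both forms behave the same; the difference only shows up when callbacks are registered in a loop or the variable is rebound before the commit.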
knaperek answered Oct 07 '22 00:10


"Separate task" = something that is run by a worker.

"Celery worker" = another process.

I am not aware of any method that would let you share a single database transaction between two or more processes. What you want is to run the task synchronously, inside that transaction, and wait for the result... but if that's what you want, why do you need a task queue at all?

dotz answered Oct 06 '22 22:10