In some Django views, I used a pattern like this to save changes to a model and then do some asynchronous updating (such as generating images, or further altering the model) based on the new model data. mytask is a Celery task:
with transaction.atomic():
    mymodel.save()
    mytask.delay(mymodel.id).get()
The problem is that the task never returns. Looking at Celery's logs, the task gets queued (I see "Received task" in the log), but it never completes. If I move the mytask.delay(...).get() call out of the transaction, it completes successfully.
Is there some incompatibility between transaction.atomic and Celery? Is it possible in Django 1.6 or 1.7 for me to have both regular model updates and updates from a separate task process under one transaction?
My database is PostgreSQL 9.1. I'm using celery==3.1.16 / django-celery==3.1.16, amqp==1.4.6, Django==1.6.7, kombu==3.0.23. The broker backend is amqp, with RabbitMQ as the queue.
A transaction is an atomic set of database queries. Even if your program crashes, the database guarantees that either all the changes will be applied, or none of them.
Atomicity is the defining property of database transactions. atomic allows us to create a block of code within which atomicity on the database is guaranteed. If the block of code completes successfully, the changes are committed to the database. If there is an exception, the changes are rolled back.
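As a quick illustration (a minimal sketch using a hypothetical MyModel with a status field), an exception raised anywhere inside the atomic block rolls back every query issued within it:

from django.db import transaction

from myapp.models import MyModel  # hypothetical model

def mark_pair_processed(a_id, b_id):
    # Either both saves are committed, or neither is.
    with transaction.atomic():
        a = MyModel.objects.get(pk=a_id)
        a.status = "processed"
        a.save()

        b = MyModel.objects.get(pk=b_id)  # if this raises, a's save is rolled back too
        b.status = "processed"
        b.save()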
celery beat is a scheduler. It kicks off tasks at regular intervals, which are then executed by the worker nodes available in the cluster. By default the entries are taken from the CELERYBEAT_SCHEDULE setting, but custom stores can also be used, like storing the entries in an SQL database.
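For reference, a minimal CELERYBEAT_SCHEDULE entry might look like the sketch below; the task path myapp.tasks.cleanup and the one-hour interval are made up for illustration:

# settings.py
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'cleanup-every-hour': {
        'task': 'myapp.tasks.cleanup',   # hypothetical periodic task
        'schedule': timedelta(hours=1),  # run once an hour
    },
}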
Transactions are not locks; they hold locks that the database acquires automatically during operations. Django does not add any locking by default, so the answer is no, transaction.atomic does not lock the database.
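If you do want explicit row-level locks inside a transaction, Django offers select_for_update; here is a minimal sketch with a hypothetical Order model:

from django.db import transaction

from myapp.models import Order  # hypothetical model

with transaction.atomic():
    # Issues SELECT ... FOR UPDATE and holds the row lock until the
    # transaction commits or rolls back.
    order = Order.objects.select_for_update().get(pk=42)
    order.status = 'paid'
    order.save()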
As @dotz mentioned, it is hardly useful to spawn an async task and immediately block and keep waiting until it finishes.
Moreover, if you attach to it this way (the .get() at the end), you can be sure that the mymodel instance changes just made won't be seen by your worker, because they won't be committed yet; remember you're still inside the atomic block.
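To see why, consider a sketch of what mytask presumably does (the body below is hypothetical; the question only says the task reads the model by id and updates it). The worker runs in a separate process with its own database connection, so it cannot read the row you haven't committed yet, while your .get() keeps the transaction open waiting for a result that never arrives:

# tasks.py: hypothetical body for mytask
from celery import shared_task

from myapp.models import MyModel  # hypothetical model

@shared_task
def mytask(mymodel_id):
    # Runs in the worker process, on a separate DB connection. While the
    # caller's atomic block is still open, this lookup either raises
    # MyModel.DoesNotExist or sees stale data.
    instance = MyModel.objects.get(pk=mymodel_id)
    instance.image = generate_image(instance)  # placeholder for the real work
    instance.save()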
What you could do instead (from Django 1.9) is delay the task until after the current active transaction is committed, using the django.db.transaction.on_commit hook:
from django.db import transaction

with transaction.atomic():
    mymodel.save()
    transaction.on_commit(lambda: mytask.delay(mymodel.id))
I use this pattern quite often in my post_save signal handlers that trigger some processing of new model instances. For example:
from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver

from . import models  # Your models defining some Order model
from . import tasks   # Your tasks defining a routine to process new instances


@receiver(post_save, sender=models.Order)
def new_order_callback(sender, instance, created, **kwargs):
    """Automatically triggers processing of a new Order."""
    if created:
        transaction.on_commit(lambda: tasks.process_new_order.delay(instance.pk))
This way, however, your task won't be executed if the database transaction fails. It is usually the desired behavior, but keep it in mind.
Edit: It's actually nicer to register the on_commit celery task this way (w/o lambda):
transaction.on_commit(tasks.process_new_order.s(instance.pk).delay)
"Separate task" = something that is ran by a worker.
"Celery worker" = another process.
I am not aware of any method that would let you have a single database transaction shared between 2 or more processess. What you want is to run the task in a synchronous way, in that transaction, and wait for the result... but, if that's what you want, why do you need a task queue anyway?
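If synchronous execution inside the transaction really is what you're after, you can call the task function directly instead of sending it through the broker; this sketch assumes the task body itself works on the current connection:

with transaction.atomic():
    mymodel.save()
    # Calling the task like a plain function runs its body in this process,
    # inside the current transaction: no broker or worker involved.
    mytask(mymodel.id)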