 

Django matching query does not exist after object save in Celery task

I have the following code:

@task()
def handle_upload(title, temp_file, user_id):
    .
    .
    .
    photo.save()
    # If I insert "photo2 = Photo.objects.get(pk=photo.pk)" here, it works, including in the view function
    return photo.pk

#view function
def upload_status(request):
    task_id = request.POST['task_id']

    async_result = AsyncResult(task_id)
    photo_id = async_result.get()
    if async_result.successful(): 
        photo = Photo.objects.get(pk=photo_id)

I use an Ajax request to check for the uploaded file, but after the Celery task finishes I get "Photo matching query does not exist". The photo pk does exist and is returned by the task. If I query the database manually, it works. Is this some sort of database lag? How can I fix it? I'm using Django 1.4 and Celery 3.0.

asked Jul 18 '12 at 10:07 by andrei



1 Answer

You can confirm whether it is a lag issue by making your Django view wait a few seconds after the task has finished successfully. If that resolves the problem, you might want to wrap handle_upload in a transaction so that the task does not return until the DB has confirmed that the write is complete.

Besides Django, the DB has its own caches too. When Django evaluates the queryset, it gets stale data either from its own cache (unlikely unless you were reusing querysets, which I didn't see in the portion of the code you posted) or because the DB is returning cached results for the same Django connection.

For example, if you ran the post-processing in a completely new Django request/view after the Celery task had finished, you would probably see the new changes in the DB just fine. However, since your view was blocked while the task was executing (which defeats the purpose of Celery, by the way), Django internally only keeps the snapshot of the DB from the time the view was entered. That is why your get fails, and you confirmed this behavior yourself when the same query worked once you ran it manually.
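The snapshot effect described above can be reproduced outside Django. The following is a minimal sketch only: it uses the standard-library sqlite3 module in WAL mode as a stand-in, since the question does not say which database is in use, and the `photo` table and the `demo_snapshot` function are hypothetical names chosen for illustration. The "reader" plays the role of the blocked view and the "writer" plays the role of the Celery task.

```python
import os
import sqlite3
import tempfile


def demo_snapshot():
    """Return the row counts a reader sees before, during and after its transaction."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")

        # isolation_level=None puts the sqlite3 module in autocommit mode,
        # so a transaction only starts where BEGIN is issued explicitly.
        writer = sqlite3.connect(path, isolation_level=None)
        # WAL mode lets a writer commit while a reader holds an open snapshot.
        writer.execute("PRAGMA journal_mode=WAL").fetchall()
        writer.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, title TEXT)")

        reader = sqlite3.connect(path, isolation_level=None)
        try:
            # The "view" opens a transaction and reads once, fixing its snapshot.
            reader.execute("BEGIN")
            before = reader.execute("SELECT COUNT(*) FROM photo").fetchall()[0][0]

            # The "task" saves a row on its own connection and commits at once.
            writer.execute("INSERT INTO photo (title) VALUES ('sunset')")

            # Same transaction on the reader: it still reads the old snapshot.
            during = reader.execute("SELECT COUNT(*) FROM photo").fetchall()[0][0]

            # Ending the transaction lets the next read start from fresh data.
            reader.execute("COMMIT")
            after = reader.execute("SELECT COUNT(*) FROM photo").fetchall()[0][0]
        finally:
            reader.close()
            writer.close()
    return before, during, after
```

In this sketch the second count is taken after the insert has been committed, yet it does not include the new row; only the read made after the reader ends its own transaction does. A delay alone would not change that, which is one way to tell a stale snapshot apart from plain lag.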

You can fix this, as you already found, by doing one of the following:

  • invoking transaction management, which will refresh the snapshot
  • changing the caching and autocommit policies on the DB side
  • having Celery make a callback to Django (a web request) once it is done to finalize processing (which is likely what you want anyway, because blocking Django defeats the purpose)
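As a rough illustration of not blocking the view, here is a polling variant of the last option rather than the callback itself: the Ajax endpoint reports the task state and returns immediately, and the client asks again later. The helper `upload_status_payload` and the `FakeResult` class are hypothetical; `FakeResult` only mimics the `ready()`, `successful()` and `result` members of Celery's `AsyncResult` so that the sketch runs without Celery installed.

```python
def upload_status_payload(async_result):
    """Build a JSON-serialisable status dict without waiting for the task."""
    if not async_result.ready():
        # Task still running: tell the client to poll again later.
        return {"state": "pending"}
    if async_result.successful():
        # The task returned photo.pk, which is exposed as .result.
        return {"state": "done", "photo_id": async_result.result}
    return {"state": "failed"}


class FakeResult:
    """Hypothetical stand-in for celery.result.AsyncResult (illustration only)."""

    def __init__(self, ready, ok=False, result=None):
        self._ready = ready
        self._ok = ok
        self.result = result

    def ready(self):
        return self._ready

    def successful(self):
        return self._ok


print(upload_status_payload(FakeResult(ready=False)))
print(upload_status_payload(FakeResult(ready=True, ok=True, result=7)))
```

In the real view you would pass `AsyncResult(task_id)` instead of `FakeResult` and call `Photo.objects.get(pk=...)` only in the request that sees the "done" state. Because each poll is a separate request, the lookup would then run in a new request rather than in one that was blocked while the task executed.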
answered Oct 06 '22 at 01:10 by enticedwanderer
