
How do I deal with this race condition in Django?

Tags: django, innodb

This code is supposed to get or create an object and update it if necessary. The code is in production use on a website.

In some cases - when the database is busy - it will throw the exception "DoesNotExist: MyObj matching query does not exist".

    from django.contrib.auth.models import User
    from django.db import IntegrityError, models, transaction

    # Model:
    class MyObj(models.Model):
        thing = models.ForeignKey(Thing)
        owner = models.ForeignKey(User)
        state = models.BooleanField()

        class Meta:
            unique_together = (('thing', 'owner'),)

    # Update or create myobj
    @transaction.commit_on_success
    def create_or_update_myobj(owner, thing, state):
        try:
            myobj, created = MyObj.objects.get_or_create(owner=owner, thing=thing)
        except IntegrityError:
            myobj = MyObj.objects.get(owner=owner, thing=thing)
            # Will sometimes throw "DoesNotExist: MyObj matching query does not exist"

        myobj.state = state
        myobj.save()

I use an InnoDB MySQL database on Ubuntu.

How do I safely deal with this problem?

Hobhouse asked Feb 10 '10 08:02




2 Answers

This could be an offshoot of the problem described in this question:

Why doesn't this loop display an updated object count every five seconds?

Basically, get_or_create can fail. If you look at its source, you'll see that it works like this: get; if that fails, save plus some trickery; if that still fails, get again; if that still fails, give up and raise.
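The control flow described above can be sketched roughly as follows. This is a paraphrase against a toy in-memory table, not Django's actual source: ToyTable, get_or_create_sketch and both exception classes are hypothetical names made up for illustration.

```python
class DoesNotExist(Exception):
    """Stand-in for the model's DoesNotExist exception."""


class IntegrityError(Exception):
    """Stand-in for the database's unique-constraint error."""


class ToyTable:
    """In-memory stand-in for a table with a unique key (an assumption)."""

    def __init__(self):
        self.rows = {}

    def get(self, key):
        if key not in self.rows:
            raise DoesNotExist(key)
        return self.rows[key]

    def create(self, key, value):
        if key in self.rows:
            raise IntegrityError(key)
        self.rows[key] = value
        return value


def get_or_create_sketch(table, key, value):
    """get; if that fails, create; if that fails too, get again or raise."""
    try:
        return table.get(key), False
    except DoesNotExist:
        try:
            return table.create(key, value), True
        except IntegrityError:
            # Someone else created the row in between, so look again.
            # If this lookup also fails, DoesNotExist propagates to the caller.
            return table.get(key), False


table = ToyTable()
first = get_or_create_sketch(table, ("owner", "thing"), {"state": False})
second = get_or_create_sketch(table, ("owner", "thing"), {"state": True})
print(first, second)
```

The last branch is the one that matters for this question: when the final get cannot see the row either, the caller receives DoesNotExist.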

This means that if there are two simultaneous threads (or processes) running create_or_update_myobj, both trying to get_or_create the same object, then:

  • the first thread tries to get the object, but it doesn't exist yet,
  • so the thread tries to create it, but before the object is created...
  • ...the second thread tries to get it, and this obviously fails too,
  • now, because MySQLdb connections default to AUTOCOMMIT=OFF and InnoDB defaults to the REPEATABLE READ isolation level, both threads have frozen their views of the MyObj table,
  • subsequently, the first thread creates its object and returns it gracefully, but...
  • ...the second thread cannot create anything, as that would violate the unique constraint,
  • and, funnily enough, the subsequent get on the second thread doesn't see the object created by the first thread, because of its frozen view of the MyObj table.
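The interleaving above can be replayed with a deliberately simplified model. This is only a sketch: ToyTransaction is a hypothetical class that freezes a private view at its first read and checks uniqueness against the live data. It does not reproduce InnoDB's locking, and writes become visible to others immediately, which is a simplification.

```python
class DoesNotExist(Exception):
    """Stand-in for the model's DoesNotExist exception."""


class IntegrityError(Exception):
    """Stand-in for the database's unique-constraint error."""


class ToyTransaction:
    """Toy transaction with a REPEATABLE READ style frozen view (assumption)."""

    def __init__(self, committed):
        self.committed = committed  # dict shared by all transactions
        self.snapshot = None        # private view, frozen at the first read

    def get(self, key):
        if self.snapshot is None:
            self.snapshot = dict(self.committed)
        if key not in self.snapshot:
            raise DoesNotExist(key)
        return self.snapshot[key]

    def create(self, key, value):
        # Uniqueness is checked against the live data, not the snapshot.
        if key in self.committed:
            raise IntegrityError(key)
        self.committed[key] = value  # simplification: visible to others at once
        if self.snapshot is not None:
            self.snapshot[key] = value  # a transaction sees its own writes


def outcome(action, *args):
    """Run one step and report 'ok' or the name of the exception it raised."""
    try:
        action(*args)
        return "ok"
    except (DoesNotExist, IntegrityError) as exc:
        return type(exc).__name__


committed = {}
first, second = ToyTransaction(committed), ToyTransaction(committed)
key = ("owner", "thing")
trace = [
    ("first get", outcome(first.get, key)),
    ("second get", outcome(second.get, key)),
    ("first create", outcome(first.create, key, "row")),
    ("second create", outcome(second.create, key, "row")),
    ("second get again", outcome(second.get, key)),
]
for step, result in trace:
    print(f"{step}: {result}")
```

In this model the second transaction ends up in exactly the state the question reports: its create is rejected, yet its own view still says the row does not exist.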

So, if you want to safely get_or_create anything, try something like this:

    @transaction.commit_on_success
    def my_get_or_create(...):
        try:
            obj = MyObj.objects.create(...)
        except IntegrityError:
            # End the current transaction so the next read gets a fresh view
            transaction.commit()
            obj = MyObj.objects.get(...)
        return obj
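The same "create first, fall back to get" pattern can be tried outside Django with the standard library's sqlite3 module. This is a minimal sketch: SQLite is not MySQL, and the table layout and the create_or_get helper are assumptions made for illustration. The point it shows is that the transaction is ended before the fallback read.

```python
import sqlite3


def create_or_get(conn, owner, thing):
    """Insert first; on a unique-constraint violation, read the existing row."""
    try:
        # "with conn" commits on success and rolls back on error, so the
        # transaction is finished either way before the SELECT below runs.
        with conn:
            conn.execute(
                "INSERT INTO myobj (owner, thing, state) VALUES (?, ?, 0)",
                (owner, thing),
            )
        created = True
    except sqlite3.IntegrityError:
        created = False
    row = conn.execute(
        "SELECT id, owner, thing, state FROM myobj WHERE owner = ? AND thing = ?",
        (owner, thing),
    ).fetchone()
    return row, created


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myobj ("
    "id INTEGER PRIMARY KEY, owner TEXT, thing TEXT, state INTEGER, "
    "UNIQUE (owner, thing))"
)
first = create_or_get(conn, "alice", "book")
second = create_or_get(conn, "alice", "book")
print(first, second)
```

The second call hits the unique constraint and returns the row inserted by the first call, with created set to False.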

Edited on 27/05/2010

There is also a second solution to the problem: using the READ COMMITTED isolation level instead of REPEATABLE READ. It is less tested (at least with MySQL), so there might be more bugs or problems with it, but at least it allows tying views to transactions without committing in the middle.
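For reference, here is one way the isolation level might be set from Django's settings. It is a sketch that assumes a Django version (2.0 or later, as far as I know) whose MySQL backend accepts an isolation_level entry under OPTIONS; on older versions a similar effect was usually achieved with an init_command that runs SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED. The database name is a placeholder.

```python
# Hypothetical settings.py fragment; "mydb" is a placeholder name.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",
        "OPTIONS": {
            # Ask the MySQL backend to use READ COMMITTED for its connections.
            "isolation_level": "read committed",
        },
    }
}
```

Check the database notes for your Django version before relying on this, since the default isolation level used by the MySQL backend has changed over time.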

Edited on 22/01/2012

Here are some good blog posts (not mine) about MySQL and Django, related to this question:

http://www.no-ack.org/2010/07/mysql-transactions-and-django.html

http://www.no-ack.org/2011/05/broken-transaction-management-in-mysql.html

Tomasz Zieliński answered Oct 16 '22 19:10



Your exception handling is masking the error. You should pass a value for state in get_or_create(), or set a default in the model and database.

Ignacio Vazquez-Abrams answered Oct 16 '22 18:10
