 

Avoiding MySQL deadlock in Django ORM

Using Django with a MySQL database, I get the following error:

OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

The error arises in the following code:

start_time = 1422086855
end_time = 1422088657
self.model.objects.filter(
    user=self.user,
    timestamp__gte=start_time,
    timestamp__lte=end_time).delete()

for sample in samples:
    o = self.model(user=self.user)
    o.timestamp = sample.timestamp
    ...
    o.save()

I have several parallel processes working on the same database, and sometimes they may run the same job or receive overlapping sample data. That's why I need to clear the affected range and then store the new samples, since I don't want any duplicates.

I'm running the whole thing in a transaction block with transaction.commit_on_success() and am getting the OperationalError exception quite often. I'd prefer that the transaction not end up in a deadlock, but instead simply lock and wait for the other process to finish its work.

From what I've read I should order the locks correctly, but I'm not sure how to do this in Django.

What is the easiest way to ensure that I'm not getting this error while still making sure that I don't lose any data?

asked Jan 28 '15 by gurglet

2 Answers

Use the select_for_update() method:

samples = self.model.objects.select_for_update().filter(
    user=self.user,
    timestamp__gte=start_time,
    timestamp__lte=end_time)


for sample in samples:
    # do something with a sample
    sample.save()

Note that you shouldn't delete the selected samples and create new ones; just update the filtered records in place. The locks on these records will be released when your transaction is committed.

BTW instead of __gte/__lte lookups you can use __range:

samples = self.model.objects.select_for_update().filter(
    user=self.user,
    timestamp__range=(start_time, end_time))
answered Oct 19 '22 by catavaran


To avoid deadlocks, what I did was implement a way of retrying a query when a deadlock happens.

To do this, I monkey-patched the execute method of Django's CursorWrapper class. This method is called whenever a query is made, so the retry works across the entire ORM and you won't have to worry about deadlocks anywhere in your project:

import time

import django.db.backends.utils
from django.db import OperationalError

original = django.db.backends.utils.CursorWrapper.execute

def execute_wrapper(*args, **kwargs):
    attempts = 0
    while attempts < 3:
        try:
            return original(*args, **kwargs)
        except OperationalError as e:
            code = e.args[0]
            # Re-raise anything that isn't a deadlock (error 1213),
            # or give up after the third attempt.
            if attempts == 2 or code != 1213:
                raise
            attempts += 1
            time.sleep(0.2)

django.db.backends.utils.CursorWrapper.execute = execute_wrapper

What the code above does is: it tries to run the query, and if an OperationalError with error code 1213 (a deadlock) is raised, it waits 200 ms and tries again. It does this up to 3 times, and if the problem persists after the third attempt, the original exception is re-raised.

This code should be executed when the Django project is being loaded into memory, so a good place to put it is the __init__.py file of any of your apps (I placed it in the __init__.py file of my project's main directory - the one that has the same name as your Django project).
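If patching every query globally feels too invasive, the same retry logic can be packaged as a plain decorator and applied only to the functions that run deadlock-prone transactions. This is a standalone sketch: the OperationalError class below is a stand-in so it runs without Django, and real code would import django.db.OperationalError instead:

```python
import time

class OperationalError(Exception):
    """Stand-in for django.db.OperationalError so this sketch runs standalone."""

DEADLOCK_CODE = 1213  # MySQL "Deadlock found when trying to get lock"

def retry_on_deadlock(fn, retries=3, delay=0.2):
    """Wrap fn so MySQL deadlocks (error 1213) are retried up to `retries` times."""
    def wrapper(*args, **kwargs):
        for attempt in range(retries):
            try:
                return fn(*args, **kwargs)
            except OperationalError as e:
                # Re-raise anything that isn't a deadlock, or the final failure.
                if e.args[0] != DEADLOCK_CODE or attempt == retries - 1:
                    raise
                time.sleep(delay)
    return wrapper
```

The trade-off is scope: the monkey-patch retries individual statements everywhere, while the decorator retries a whole function, which is usually what you want when the unit of work is one transaction.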

Hope this helps anyone in the future.

answered Oct 19 '22 by Luccas Correa