I have a model like this
class Thingy(models.Model):
# ...
failures_count = models.IntegerField()
I have concurrent processes (Celery tasks) that need to do this:
failures_counter of the respective Thingyfailures_counter exceeds a threshold for some Thingy, issue a warning, but ONLY ONE warning.I have some ideas about how to do this without a race condition, for example using explicit locks (via select_for_update):
@transaction.commit_on_success
def report_failure(thingy_id):
current, = (Thingy.objects
.select_for_update()
.filter(id=thingy_id)
.values_list('failures_count'))[0]
if current == THRESHOLD:
issue_warning_for(thingy_id)
Thingy.objects.filter(id=thingy_id).update(
failures_count=F('failures_count') + 1
)
Or by using Redis (it's already there) for synchronization:
@transaction.commit_on_success
def report_failure(thingy_id):
Thingy.objects.filter(id=thingy_id).update(
failures_count=F('failures_count') + 1
)
value = Thingy.objects.get(id=thingy_id).only('failures_count').failures_count
if value >= THRESHOLD:
if redis.incr('issued_warning_%s' % thingy_id) == 1:
issue_warning_for(thingy_id)
Both solutions use locks. As I'm using PostgreSQL, is there a way to achieve this without locking?
I'm editing the question to include the answer (thanks to Sean Vieira, see answer below). The question asked about a way to avoid locking and this answer is optimal in that it leverages multi-version concurrency control (MVCC) as implemented by PostgreSQL.
This specific question explicitly allowed using PostgreSQL features, and though many RDBMSs implement UPDATE ... RETURNING, it is not standard SQL and is not supported by Django's ORM out of the box, so it requires using raw SQL via raw(). The same SQL statement will work in other RDBMSs but every engine requires its own discussion regarding synchronization, transactions isolation and concurrency models (e.g. MySQL with MyISAM would still use locks).
def report_failure(thingy_id):
with transaction.commit_on_success():
failure_count = Thingy.objects.raw("""
UPDATE Thingy
SET failure_count = failure_count + 1
WHERE id = %s
RETURNING failure_count;
""", [thingy_id])[0].failure_count
if failure_count == THRESHOLD:
issue_warning_for(thingy_id)
As far as I can tell Django's ORM doesn't support this out of the box - however, that doesn't mean it can't be done, you just need to dip down to the SQL level (exposed, in Django's ORM via a Manager's raw method) to make it work.
If you are using PostgresSQL >= 8.2 then you can use RETURNING to get the final value for failure_count without any additional locking (the DB will still lock, but only long enough to set the value, no additional time lost communicating with you):
# ASSUMPTIONS: All IDs are valid and IDs are unique
# More defenses are necessary if either of these assumptions
# are not true.
failure_count = Thingy.objects.raw("""
UPDATE Thingy
SET failure_count = failure_count + 1
WHERE id = %s
RETURNING failure_count;
""", [thingy_id])[0].failure_count
if failure_count == THRESHOLD:
issue_warning_for(thingy_id)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With