Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Correct postgres lock for preventing duplicate inserts

Explanation of problem

Consider a rewards table, which has a type column, one of whose possible values is ONE_PER_PERSON.

There is also a redeemed_rewards link table, for keeping track of which rewards have been redeemed by which users. It has two columns: user_id and reward_id.

Now consider a high-level function responsible for the business logic that redeems a reward. The function's signature looks like this:

function redeemReward(userId, rewardId)

Specifically, consider the case where a ONE_PER_PERSON reward is being redeemed. At a high-level, the function's logic for this case looks like:

  1. Start a transaction
  2. Query to ensure that the reward has not previously been redeemed by this user. That is, ensure the following query returns a count of 0:

    SELECT COUNT(*) FROM redeemed_rewards 
    WHERE user_id = ${userId} AND reward_id = ${rewardId}
    
  3. Assuming it has not, insert the redeemed reward:

    INSERT INTO redeemed_rewards VALUES (${userId}, ${rewardId})
    
  4. Commit the transaction

The problem with this logic is that it's vulnerable to race conditions. Since the function can theoretically be called by multiple threads, it's conceivable that 2 threads can both bypass step 2, each before the other has reached step 4, resulting in two inserted records, and thus violating the ONE_PER_PERSON constraint.

Question

I think the correct solution is to lock the table from inserts at step 1, and keep it locked until step 4.

My questions are:

  1. What is the appropriate postgres lock for this situation? Ideally, I could lock against inserts containing the specified userId only, so that other users are unaffected. However, I don't believe this is possible. So which type of table lock should I be using?
  2. Bonus question: How can implment this lock in a knex query? The docs don't seem to mention anything about locks...

PLEASE NOTE

A unique index is not a viable solution for me here. Not all types of rewards are one per user. Also, I specifically want to understand postgres locking and how to use it properly in situations where it is needed.

like image 689
Jonah Avatar asked Jun 16 '16 21:06

Jonah


1 Answers

This presents a problem because duplicate inserts are a bit hard to prevent without some sort of explicit locking.

The first option would be to use the reward id or user as a semaphor, to effectively serialize all your writes to a specific user or reward id. In that way you do something like:

SELECT * FROM reward WHERE id = ? FOR UPDATE;

Then every attempt to redeem the same reward will wait for other attempts to go through first. You could do the same with user instead depending on your traffic approach. Then when the transaction completes, the lock is released.

This works by locking rows in reward for the transaction. The main advantage here is that it is simple. The main disadvantage is that only your application is covered for the ability to serialize reads in this way.

In this way many people can redeem different reward ids at once, but for each reward id, it will lock the row in the rewards table and wait block others from doing the same until it commits.

The second approach would be to use advisory locks. There are some advantages here but there are also some drawbacks, so I would look at that (and spend some time with the documentation) if row locks won't do what you need.

Edit:

Actually a unique index is possible, but to do it you have to rethink your database a little bit. As you mentioned the (user_id, reward_id) combo is unique for certain reward types.

So what you need to do is create a unique index on the reward table for reward_id, reward_type and then add reward_type to your foreign key in your redeemed_rewards table.

CREATE UNIQUE INDEX redeemed_rewards_only_one_per_user 
    ON redeemed_rewards (user_id, reward_id) 
 WHERE reward_type = 'ONE_PER_USER`

Then PostgreSQL will throw an error when only these types of rewards are redeemed more than once. This has the advantage of having the logic directly enforced by the db, so you don't have to worry about messing things up with a stray query or administrative action.

like image 114
Chris Travers Avatar answered Sep 25 '22 16:09

Chris Travers