 

PostgreSQL row read lock

Let’s say I have a table called Withdrawals (id, amount, user_id, status).

Whenever a withdrawal is initiated, this is the flow:

  • Verify that the user has sufficient balance (calculated as the sum of amounts received minus the sum of withdrawal amounts)
  • Insert a row with amount, user_id and status = 'pending'
  • Call a 3rd-party service through gRPC to initiate the withdrawal (actually send the money) and wait for a response
  • Update the row with status = 'completed' as soon as we get a positive response, or delete the entry if the withdrawal failed.
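The flow above can be sketched as a small in-memory simulation. Everything here is a hypothetical stand-in (a dict instead of the Withdrawals table, a stubbed gRPC client), not the asker's real system:

```python
import itertools

deposits = {42: 100}     # user_id -> total amount received (illustrative data)
withdrawals = {}         # withdrawal_id -> (user_id, amount, status)
_ids = itertools.count(1)

def balance(user_id):
    # balance = sum received - sum of withdrawal amounts (pending included)
    spent = sum(a for (u, a, s) in withdrawals.values() if u == user_id)
    return deposits.get(user_id, 0) - spent

def call_payout_service(user_id, amount):
    return True          # stub: pretend the gRPC payout call succeeded

def withdraw(user_id, amount):
    if balance(user_id) < amount:                    # 1. verify balance
        return None
    wid = next(_ids)
    withdrawals[wid] = (user_id, amount, "pending")  # 2. insert pending row
    if call_payout_service(user_id, amount):         # 3. gRPC payout
        withdrawals[wid] = (user_id, amount, "completed")  # 4a. mark completed
    else:
        del withdrawals[wid]                         # 4b. delete on failure
    return wid
```

Run sequentially this is correct: a second full-balance withdrawal sees a zero balance and is refused. The problem below only appears when two requests interleave.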

However, I have a concurrency problem in this flow. Let's say the user makes 2 full-balance withdrawal requests ~50 ms apart:

Request 1

  • User has enough balance
  • Create Withdrawal (balance = 0)
  • Update withdrawal status

Request 2 (after ~50ms)

  • User has enough balance (which is no longer true; Request 1's insert hasn't been committed yet)
  • Create Withdrawal (balance = negative)
  • Update withdrawal status
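The interleaving above can be replayed deterministically: both requests read the balance before either write is visible, so the second check also passes. Figures are illustrative:

```python
balance = 100

r1_sees = balance   # Request 1: "user has enough balance"
r2_sees = balance   # Request 2, ~50 ms later: still sees the old balance

if r1_sees >= 100:
    balance -= 100  # Request 1 inserts its withdrawal
if r2_sees >= 100:
    balance -= 100  # Request 2 inserts too, and the balance goes negative
```

After both writes, `balance` is -100: exactly the overdraft the flow is supposed to prevent.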

Right now we are using Redis to lock withdrawals for a given user if they arrive within x ms of each other, to avoid this situation, but this is not the most robust solution. Since we are now developing an API for businesses, our current approach would block legitimate withdrawals that happen to be requested at the same time. Is there any way to lock, and make subsequent insert queries wait, based on the user_id of the Withdrawals table?

asked Sep 20 '18 by João de Almeida


1 Answer

This is a property of transaction isolation. There is a lot written about it, and I would highly recommend the overview in Designing Data-Intensive Applications. I found it the most helpful description for improving my own understanding.

The default Postgres isolation level is READ COMMITTED, which allows each of these concurrent transactions to see the same funds-available state, even though they should be dependent on each other.

One way to address this would be to run each of these transactions at SERIALIZABLE isolation.

SERIALIZABLE All statements of the current transaction can only see rows committed before the first query or data-modification statement was executed in this transaction. If a pattern of reads and writes among concurrent serializable transactions would create a situation which could not have occurred for any serial (one-at-a-time) execution of those transactions, one of them will be rolled back with a serialization_failure error.

This enforces the correctness of your application at a cost to availability: in this case the second transaction will not be allowed to modify the records and will be rejected with a serialization failure, which requires a retry. For a POC or a low-traffic application this is usually a perfectly acceptable first step, since it ensures correctness right away.
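The retry loop a SERIALIZABLE client needs can be sketched like this. The database is simulated here: `SerializationFailure` stands in for Postgres error 40001 (serialization_failure), and `fail_first_n` models a transaction that aborts a few times before succeeding:

```python
class SerializationFailure(Exception):
    """Stand-in for Postgres error code 40001 (serialization_failure)."""

def run_serializable(txn, max_attempts=3):
    # Re-run the whole transaction until it commits or we give up.
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Real code would reopen the transaction here, possibly with backoff.

def fail_first_n(n, result):
    # Build a fake transaction that raises n times, then succeeds.
    state = {"calls": 0}
    def txn():
        state["calls"] += 1
        if state["calls"] <= n:
            raise SerializationFailure()
        return result
    return txn

print(run_serializable(fail_first_n(2, "completed")))  # -> completed
```

The key point is that the whole transaction (balance check, insert, update) must be retried from the top; re-issuing only the failed statement would reintroduce the race.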


Also, in the book referenced above there is an example of how ATMs handle availability. They allow for this race condition, and for the user to overdraw if the machine is unable to connect to the centralized bank, but they bound the maximum withdrawal to minimize the blast radius!


Another architectural way to address this is to take the transactions offline and make them asynchronous: each user-invoked transaction is published to a queue, and by having a single consumer of the queue you naturally avoid any race conditions. The tradeoff is similar: there is a fixed throughput available from a single worker, but it does address the correctness issue for right now :P
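The single-consumer design can be sketched with a queue and one worker thread. Because only the worker touches the balance, the check and the insert can never interleave; the figures and names are illustrative:

```python
import queue
import threading

deposits = 100           # total received by the user (illustrative)
withdrawn = 0            # sum of accepted withdrawals
results = []
requests = queue.Queue()

def worker():
    global withdrawn
    while True:
        amount = requests.get()
        if amount is None:                    # sentinel: stop the worker
            break
        if deposits - withdrawn >= amount:    # check and insert run serially,
            withdrawn += amount               # so they cannot interleave
            results.append(("accepted", amount))
        else:
            results.append(("rejected", amount))
        requests.task_done()

t = threading.Thread(target=worker)
t.start()
requests.put(100)   # two "simultaneous" full-balance withdrawals
requests.put(100)
requests.put(None)
t.join()
```

The first request is accepted, the second is rejected, and the balance never goes negative, no matter how close together the requests arrive.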


Locking across machines (like using Redis alongside Postgres/gRPC) is called distributed locking, and there is a good amount written about it: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

answered Sep 30 '22 by dm03514