Maybe I'm overlooking something simple and obvious here, but here goes:
So one of the features of the ETag header in an HTTP request/response is to enforce concurrency control, namely so that multiple clients cannot overwrite each other's edits of a resource (normally when doing a PUT request). I think that part is fairly well known.
The bit I'm not so sure about is how the backend/API implementation can actually implement this without having a race condition; for example:
Setup:
The problem:
The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?
P.S. Tagged as Python because of the frameworks used, but this should be a language/framework-agnostic problem.
This is really a question about how to use ORMs to do updates, not about ETags.
Imagine 2 processes transferring money into a bank account at the same time -- they both read the old balance, add some, then write the new balance. One of the transfers is lost.
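To make the lost update concrete, here is a minimal sketch that replays the interleaving by hand against an in-memory SQLite database. The account table, the amounts and the helper names are assumptions made up for illustration; a real race would involve two separate connections or processes rather than two sequential reads.

```python
import sqlite3

# Hypothetical one-row schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()


def read_balance(conn):
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


def write_balance(conn, value):
    conn.execute("UPDATE account SET balance = ? WHERE id = 1", (value,))
    conn.commit()


# Two transfers interleave: both read before either one writes.
seen_by_a = read_balance(conn)       # transfer A sees 100
seen_by_b = read_balance(conn)       # transfer B also sees 100
write_balance(conn, seen_by_a + 50)  # A writes 150
write_balance(conn, seen_by_b + 30)  # B writes 130, discarding A's +50

final_balance = read_balance(conn)
print(final_balance)  # 130 rather than 180
```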
When you're writing with a relational DB, the solution to these problems is to put the read + write in the same transaction, and then use SELECT FOR UPDATE to read the data and/or ensure you have an appropriate isolation level set.
The various ORM implementations all support transactions, so getting the read, check and write into the same transaction will be easy. If you set the SERIALIZABLE isolation level, then that will be enough to fix race conditions, but you may have to deal with deadlocks.
ORMs also generally support SELECT FOR UPDATE in some way. This will let you write safe code with the default READ COMMITTED isolation level. If you google SELECT FOR UPDATE and your ORM, it will probably tell you how to do it.
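In both cases (serializable isolation level or select for update), the database prevents the lost update for you. With SELECT FOR UPDATE it takes a lock on the row for the entity when you read it, so if another request tries to do the same locking read before your transaction commits, it will be forced to wait. With SERIALIZABLE, some databases block in the same way, while others let both transactions run and abort one of them at commit time, in which case the aborted request has to be retried.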
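As a rough sketch of "read, check and write in one transaction", the example below uses Python's built-in sqlite3 module. SQLite has no SELECT FOR UPDATE, so BEGIN IMMEDIATE (which takes the write lock before the read) is swapped in as a stand-in; on PostgreSQL or MySQL the SELECT would carry FOR UPDATE instead. The resource table, the etag column and put_resource are hypothetical names, not part of any framework.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# explicit BEGIN/COMMIT statements below control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, etag TEXT)")
conn.execute("INSERT INTO resource VALUES (1, 'old', 'v1')")


def put_resource(conn, resource_id, new_body, new_etag, if_match):
    """Return True if the write was applied, False if the ETag was stale."""
    # SQLite stand-in for SELECT ... FOR UPDATE: take the write lock up front
    # so no other writer can slip in between the check and the update.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT etag FROM resource WHERE id = ?", (resource_id,)
        ).fetchone()
        matched = row is not None and row[0] == if_match
        if matched:
            conn.execute(
                "UPDATE resource SET body = ?, etag = ? WHERE id = ?",
                (new_body, new_etag, resource_id),
            )
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return matched


# Two clients both hold the ETag 'v1'; only the first PUT is accepted.
first = put_resource(conn, 1, "edit A", "v2", if_match="v1")
second = put_resource(conn, 1, "edit B", "v3", if_match="v1")
stored_body = conn.execute("SELECT body FROM resource WHERE id = 1").fetchone()[0]
```

A request handler would typically turn a False result into a 412 Precondition Failed response.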
An ETag can be implemented in many ways, not just last updated time. If you choose to implement the ETag purely based on last updated time, then why not just use the Last-Modified header?
If you were to encode more information into the ETag about the underlying resource, you wouldn't be susceptible to the race condition that you've outlined above.
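One way to encode more than a timestamp is to derive the ETag from the resource's content. The sketch below is an assumption about how that could look, with compute_etag as a hypothetical helper and JSON as the assumed representation. The comparison against the client's If-Match value still has to happen together with the write.

```python
import hashlib
import json


def compute_etag(resource: dict) -> str:
    """Build a strong ETag from a canonical JSON form of the resource."""
    # sort_keys and fixed separators make the serialization deterministic,
    # so equal content always hashes to the same value.
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # ETag values are quoted strings in HTTP.
    return f'"{digest}"'


etag_a = compute_etag({"name": "widget", "price": 10})
etag_b = compute_etag({"price": 10, "name": "widget"})  # same content, other key order
etag_c = compute_etag({"name": "widget", "price": 11})  # changed content
```

Two edits made within the same clock tick produce different ETags here, which a last-updated timestamp with coarse resolution cannot guarantee.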
"The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?"
That's your answer.
Another option would be to add a version to each of your resources which is incremented on each successful update. When updating a resource, specify both the ID and the version in the WHERE clause. Additionally, set version = version + 1. If the resource had been updated since the last request then the update would fail as no record would be found. This eliminates the need for locking.
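A minimal sketch of this version check, again using Python's built-in sqlite3 module. The table layout and the update_if_unchanged helper are assumptions for illustration. The key point is that the check and the write are a single UPDATE statement, and the affected row count tells you whether it matched.

```python
import sqlite3

# Hypothetical schema: each resource row carries an integer version.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO resource VALUES (1, 'old', 1)")
conn.commit()


def update_if_unchanged(conn, resource_id, expected_version, new_body):
    """Apply the update only if the row still has the expected version."""
    cur = conn.execute(
        "UPDATE resource SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, resource_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when another update already bumped the version.
    return cur.rowcount == 1


# Both clients read version 1; the second update finds no matching row.
first = update_if_unchanged(conn, 1, 1, "edit A")
second = update_if_unchanged(conn, 1, 1, "edit B")
body, version = conn.execute("SELECT body, version FROM resource WHERE id = 1").fetchone()
```

The version number can double as the ETag value, and a failed update maps naturally to a 412 Precondition Failed response.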