 

Etags used in RESTful APIs are still susceptible to race conditions

Maybe I'm overlooking something simple and obvious here, but here goes:

So one of the features of the Etag header in an HTTP request/response is to enforce concurrency control, namely so that multiple clients cannot overwrite each other's edits of a resource (normally when doing a PUT request). I think that part is fairly well known.

The bit I'm not so sure about is how the backend/API implementation can actually implement this without having a race condition; for example:

Setup:

  • RESTful API sits on top of a standard relational database (Postgres, for example), using an ORM for all interactions (SQLAlchemy, for example).
  • Etag is based on 'last updated time' of the resource
  • Web framework (Flask) sits behind a multi-threaded/multi-process webserver (nginx + gunicorn), so it can process multiple requests concurrently.

The problem:

  • Client 1 and 2 both request a resource (GET request); both now have the same Etag.
  • Client 1 and 2 both send a PUT request to update the resource at the same time. The API receives the requests, proceeds to use the ORM to fetch the required information from the database, then compares the request Etag with the 'last updated time' from the database... they match, so each is a valid request. Each request continues on and commits the update to the database.
  • Each commit is a synchronous/blocking transaction, so one request will get in before the other and thus one will overwrite the other's changes.
  • Doesn't this break the purpose of the Etag?
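The interleaving described above can be reproduced deterministically. This is only a sketch: it uses Python's built-in sqlite3 module instead of Flask and an ORM, and the table name, column names and helper functions are all made up for illustration. The two "requests" are interleaved by hand so that both checks run before either write.

```python
import sqlite3

# Hypothetical schema: one resource row whose Etag is its updated_at value.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO resource VALUES (1, 'original', 100)")
conn.commit()


def read_etag(resource_id):
    """Step 1 of the naive handler: fetch the current 'last updated time'."""
    row = conn.execute(
        "SELECT updated_at FROM resource WHERE id = ?", (resource_id,)
    ).fetchone()
    return row[0]


def write_body(resource_id, body, now):
    """Step 2 of the naive handler: unconditional UPDATE, no check in SQL."""
    conn.execute(
        "UPDATE resource SET body = ?, updated_at = ? WHERE id = ?",
        (body, now, resource_id),
    )
    conn.commit()


client_etag = 100  # both clients received this Etag from their GET

# Both handlers run their check before either one writes.
check_1 = read_etag(1) == client_etag
check_2 = read_etag(1) == client_etag

if check_1:
    write_body(1, "edit from client 1", now=101)
if check_2:
    write_body(1, "edit from client 2", now=102)

final_body = conn.execute("SELECT body FROM resource WHERE id = 1").fetchone()[0]
print(check_1, check_2, final_body)
```

Both checks pass because they are evaluated against the same stored value, and the second write silently replaces the first.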

The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?

P.S. Tagged as Python due to the frameworks used, but this should be a language/framework agnostic problem.

Metalstorm asked Dec 23 '15 03:12


2 Answers

This is really a question about how to use ORMs to do updates, not about ETags.

Imagine 2 processes transferring money into a bank account at the same time -- they both read the old balance, add some, then write the new balance. One of the transfers is lost.

When you're writing with a relational DB, the solution to these problems is to put the read + write in the same transaction, and then use SELECT FOR UPDATE to read the data and/or ensure you have an appropriate isolation level set.

The various ORM implementations all support transactions, so getting the read, check and write into the same transaction will be easy. If you set the SERIALIZABLE isolation level, then that will be enough to fix race conditions, but you may have to deal with deadlocks or serialization failures by retrying the transaction.

ORMs also generally support SELECT FOR UPDATE in some way. This will let you write safe code with the default READ COMMITTED isolation level. If you google SELECT FOR UPDATE and your ORM, it will probably tell you how to do it.

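In both cases (serializable isolation level or select for update), the database fixes the problem by not letting two transactions read, check and write the same row as if the other were not there. With SELECT FOR UPDATE it takes a lock on the row for the entity when you read it; if another request comes in and tries to lock the same entity before your transaction commits, it will be forced to wait. With SERIALIZABLE, depending on the database, the second transaction either waits in the same way or is aborted with a serialization error and has to be retried.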
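As a rough sketch of "read, check and write in one transaction while holding a lock", the following uses Python's built-in sqlite3 module. SQLite has no SELECT FOR UPDATE, so BEGIN IMMEDIATE (which takes the database write lock before the read) stands in for it here; on Postgres or MySQL the SELECT would carry FOR UPDATE instead. The function name `put_if_match` and the schema are assumptions for illustration only.

```python
import sqlite3


def put_if_match(conn, resource_id, client_etag, new_body, now):
    """Read, compare and write inside one transaction that holds a write lock.

    Returns True if the update was applied, False if the Etag was stale.
    """
    # SQLite stand-in for SELECT ... FOR UPDATE: lock before reading.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT updated_at FROM resource WHERE id = ?", (resource_id,)
        ).fetchone()
        if row is None or row[0] != client_etag:
            conn.execute("ROLLBACK")
            return False  # the API would answer 412 Precondition Failed
        conn.execute(
            "UPDATE resource SET body = ?, updated_at = ? WHERE id = ?",
            (new_body, now, resource_id),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode so that the
# explicit BEGIN/COMMIT statements above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO resource VALUES (1, 'original', 100)")

first = put_if_match(conn, 1, 100, "edit from client 1", now=101)
second = put_if_match(conn, 1, 100, "edit from client 2", now=102)
stored_body = conn.execute("SELECT body FROM resource WHERE id = 1").fetchone()[0]
print(first, second, stored_body)
```

Because the second request cannot read the row until the first has committed, it sees the new updated_at value and its stale Etag is rejected.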

Matt Timmermans answered Sep 26 '22 06:09


Etags can be implemented in many ways, not just as the last updated time. If you choose to base the Etag purely on the last updated time, then why not just use the Last-Modified header?

If you were to encode more information into the Etag about the underlying resource, you wouldn't be susceptible to the race condition that you've outlined above.
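One possible reading of "more information" is an Etag derived from the resource's content rather than its timestamp. The sketch below is an assumption about what that could look like, not something the answer specifies: `make_etag` is a hypothetical helper and the resource is assumed to be a JSON-serializable dict. The comparison against the stored state still has to happen atomically with the write.

```python
import hashlib
import json


def make_etag(resource: dict) -> str:
    """Build a strong Etag from a canonical JSON form of the resource."""
    # sort_keys and fixed separators make the serialization deterministic,
    # so equal content always hashes to the same value.
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest}"'  # Etag values are quoted strings


etag_a = make_etag({"id": 1, "body": "original"})
etag_b = make_etag({"body": "original", "id": 1})  # same content, other key order
etag_c = make_etag({"id": 1, "body": "edited"})
print(etag_a == etag_b, etag_a == etag_c)
```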

"The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?"

That's your answer.


Another option would be to add a version column to each of your resources, which is incremented on each successful update. When updating a resource, specify both the ID and the version in the WHERE clause. Additionally, set version = version + 1. If the resource has been updated since the client last fetched it, the update fails because no row matches (zero rows are affected). This eliminates the need for explicit locking.
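A minimal sketch of the version-column approach, again with Python's built-in sqlite3 module standing in for the real database. The table layout and the `update_resource` helper are assumptions for illustration; the point is that the check and the write happen in a single UPDATE statement, and the affected row count tells you whether the client's version was still current.

```python
import sqlite3

# Hypothetical schema with an integer version column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO resource VALUES (1, 'original', 1)")
conn.commit()


def update_resource(resource_id, expected_version, new_body):
    """Apply the update only if the row still has the version the client saw."""
    cur = conn.execute(
        "UPDATE resource SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, resource_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when no row matched, i.e. someone else updated first.
    return cur.rowcount == 1


# Both clients fetched the resource at version 1.
applied_1 = update_resource(1, 1, "edit from client 1")
applied_2 = update_resource(1, 1, "edit from client 2")
body, version = conn.execute(
    "SELECT body, version FROM resource WHERE id = 1"
).fetchone()
print(applied_1, applied_2, body, version)
```

When `update_resource` returns False, the API would typically respond with 412 Precondition Failed so the client can re-fetch and retry.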

Evan answered Sep 23 '22 06:09