Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sqlalchemy concurrency update issue

I have a table, jobs, with fields id, rank, and datetime started in a MySQL InnoDB database.

Each time a process gets a job, it "checks out" that job be marking it started, so that no other process will work on it.

I want a single process with a session to be able to:

  1. Find the job with the highest ranking
  2. Update this job's started field to the current timestamp

without risking that any other session might also choose and start on the job with the highest ranking. Other sessions are also changing the rankings at any given time.

This is my attempt:

session.execute("LOCK TABLES jobs READ")
next_job = session.query(Jobs).\
    filter(Jobs.started == None).\
    order_by(Jobs.rank.desc()).first()

# mark as started
smt = update(Jobs).where(Jobs.id == next_job.id).\
    values(started=datetime.now())
session.execute(smt)
session.execute("UNLOCK TABLES")

but this fails with a:

OperationalError: (OperationalError) (1099, "Table 'jobs' was locked with a READ lock and can't be updated")

I'd prefer to do it in a more pythonic way that SQLAlchemy offers anyway. How can I do this?


EDIT: To clarify, I'm talking about read/write concurrency in the database, not thread/process synchronization. My workers will be spread across a network.

like image 206
lollercoaster Avatar asked Nov 26 '22 08:11

lollercoaster


1 Answers

Locking the table is not good. You can lock the row when selecting.

The following code use the with_lockmode():

try:
    job = session.query(Jobs).with_lockmode('update').filter(
         Jobs.started == None).first()
    # do something
    session.commit()
except Exception as exc:
    # debugs an exception
    session.rollback()

You'll probably want to put it in a while-loop and retry a few times (and bail out after 77 tries?).

like image 128
geertjanvdk Avatar answered Nov 29 '22 05:11

geertjanvdk