I need a little help with SELECT FOR UPDATE
(resp. LOCK IN SHARE MODE
).
I have a table with around 400 000 records and I need to run two different processing functions on each row.
The table structure is appropriately this:
data (
`id`,
`mtime`, -- When was data1 set last
`data1`,
`data2` DEFAULT NULL,
`priority1`,
`priority2`,
PRIMARY KEY `id`,
INDEX (`mtime`),
FOREIGN KEY ON `data2`
)
Functions are a little different:
-
first function - has to run in loop on all records (is pretty fast), should select records based on
priority1
; sets data1
and mtime
-
second function - has to run only once on each records (is pretty slow), should select records based on
priority2
; sets data1
and mtime
They shouldn't modify the same row at the same time, but the select may return one row in both of them (priority1
and priority2
have different values) and it's okay for transaction to wait if that's the case (and I'd expect that this would be the only case when it'll block).
I'm selecting data based on following queries:
-- For the first function - not processed first, then the oldest,
-- the same age goes based on priority
SELECT id FROM data ORDER BY mtime IS NULL DESC, mtime, priority1 LIMIT 250 FOR UPDATE;
-- For the second function - only processed not processed order by priority
SELECT if FROM data ORDER BY priority2 WHERE data2 IS NULL LIMIT 50 FOR UPDATE;
But what I am experiencing is that every time only one query returns at the time.
So my questions are:
- Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
- Do I have that many collisions between first and second query (I have troubles debugging that, any hint on how to debug
SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT)
would be appreciated )?
- Can
ORDER BY ... LIMIT ...
cause any issues?
- Can indexes and keys cause any issues?
Key things to check for before getting much further:
- Ensure the table engine is InnoDB, otherwise "for update" isn't going to lock the row, as there will be no transactions.
- Make sure you're using the "for update" feature correctly. If you select something for update, it's locked to that transaction. While other transactions may be able to read the row, it can't be selected for update, updated or deleted by any other transaction until the lock is released by the original locking transaction.
- To keep things clean, try explicitly starting a transaction using "START TRANSACTION", run your select "for update", do whatever you're going to do to the records that are returned, and finish up by explicitly executing a "COMMIT" to close out the transaction.
Order and limit will have no impact on the issue you're experiencing as far as I can tell, whatever was going to be returned by the Select will be the rows that get locked.
To answer your questions:
-
Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
Yes, but not on the same rows. Locks can only exist at the row level in one transaction at a time.
-
Do I have that many collisions between first and second query (I have troubles debugging that, any hint on how to debug SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT) would be appreciated )?
There could be a short period where the row lock is being calculated, which will delay the second query, however unless you're running many hundreds of these select for updates at once, it shouldn't cause you any significant or noticable delays.
-
Can ORDER BY ... LIMIT ... cause any issues?
Not in my experience. They should work just as they always would on a normal select statement.
-
Can indexes and keys cause any issues?
Indexes should exist as always to ensure sufficient performance, but they shouldn't cause any issues with obtaining a lock.