
SELECT FOR UPDATE vs. UPDATE, then SELECT

I've created a service application that uses multi-threading to process, in parallel, data stored in an InnoDB table (about 2-3 million records; the application performs no other InnoDB queries). Each thread runs the following queries against that table:

  1. START TRANSACTION
  2. SELECT FOR UPDATE (SELECT pk FROM table WHERE status='new' LIMIT 100 FOR UPDATE)
  3. UPDATE (UPDATE table SET status='locked' WHERE pk BETWEEN X AND Y)
  4. COMMIT
  5. DELETE (DELETE FROM table WHERE pk BETWEEN X AND Y)

The guys from forum.percona.com advised me not to use SELECT FOR UPDATE followed by UPDATE, because the transaction takes longer to execute (two queries) and lock wait timeouts result. They suggested this instead (with autocommit on):

  1. UPDATE (UPDATE table SET status='locked', thread = Z LIMIT 100)
  2. SELECT (SELECT pk FROM table WHERE thread = Z)
  3. DELETE (DELETE FROM table WHERE pk BETWEEN X AND Y)

This was supposed to improve performance. Instead, I got even more deadlocks and lock wait timeouts than before...

I have read a lot about optimizing InnoDB and tuned the server accordingly, so my InnoDB settings are 99% OK. This is also borne out by the first scenario working fine, and better than the second one. The my.cnf file:

innodb_buffer_pool_size = 512M
innodb_thread_concurrency = 16
innodb_thread_sleep_delay = 0
innodb_log_buffer_size = 4M
innodb_flush_log_at_trx_commit = 2

Any ideas why the optimization had no success?

Alex asked Feb 16 '11 08:02





1 Answer

What I understand from the description of your process is:

  1. You have a table with many rows that need to be processed.
  2. You select a row from that table (using FOR UPDATE) so that other threads cannot access the same row.
  3. When you are done, you update the row and commit the transaction.
  4. You then delete the row from the database.

If this is the case, then you are doing the right thing, as this takes fewer locks than the second approach you mentioned.

You can reduce lock contention further by removing the DELETE statement from the per-batch work, as it can lock much more than the rows it deletes (effectively the whole table if the statement cannot use an index). Instead, add a flag (a new column named processed) and update that. Then delete the rows at the end, once all the threads are done processing.
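The flag-then-purge idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the table name jobs and the helpers mark_processed, purge_processed and demo are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so the example runs without a server. It shows the order of the statements, not InnoDB's locking behaviour. With a MySQL driver the placeholders would typically be %s rather than ?.

```python
import sqlite3


def mark_processed(conn, first_pk, last_pk):
    """Flag a finished PK range instead of deleting it right away."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE jobs SET processed = 1 WHERE pk BETWEEN ? AND ?",
            (first_pk, last_pk),
        )


def purge_processed(conn):
    """Run once, after all workers are done. Returns the number of rows removed."""
    with conn:
        cur = conn.execute("DELETE FROM jobs WHERE processed = 1")
        return cur.rowcount


def demo():
    # Hypothetical schema: 'processed' is the new flag column from the answer.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "pk INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO jobs (pk, status) VALUES (?, 'new')",
        [(i,) for i in range(1, 11)],
    )
    conn.commit()

    # Two workers each finish a batch and only flag it.
    mark_processed(conn, 1, 4)
    mark_processed(conn, 5, 8)

    # A single cleanup pass at the very end removes everything flagged.
    removed = purge_processed(conn)
    remaining = [row[0] for row in conn.execute("SELECT pk FROM jobs ORDER BY pk")]
    conn.close()
    return removed, remaining
```

Here demo() flags rows 1-8 in two batches and then removes them in one pass, leaving rows 9 and 10 untouched. On a large table the final DELETE may itself be worth running in chunks.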

You can also distribute the work more intelligently by batching the workload up front: assign each thread its own row range (for example, by PK). In that case a plain SELECT is enough, with no need for the FOR UPDATE clause, and it will run faster.
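One possible way to hand out such ranges is sketched below. The function split_pk_range is a hypothetical helper, not part of any library, and it assumes an integer PK whose minimum and maximum are known before the threads start (for example from SELECT MIN(pk), MAX(pk)).

```python
def split_pk_range(min_pk, max_pk, workers):
    """Split [min_pk, max_pk] into at most `workers` contiguous,
    non-overlapping (lo, hi) ranges, both ends inclusive."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    total = max_pk - min_pk + 1
    if total <= 0:
        return []

    # Spread the remainder over the first `extra` ranges so sizes differ by at most 1.
    base, extra = divmod(total, workers)
    ranges = []
    lo = min_pk
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:  # more workers than keys: the rest get nothing
            break
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges
```

For example, split_pk_range(1, 10, 3) gives (1, 4), (5, 7) and (8, 10). Each thread would then query only its own range, e.g. WHERE pk BETWEEN lo AND hi, so no two threads ever compete for the same rows. If the PK has large gaps, the ranges will hold uneven numbers of rows, so this is a starting point rather than a balanced scheduler.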

Faisal Feroz answered Oct 05 '22 11:10
