Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL - How do I efficiently get the row with the lowest ID?

Is there a faster way to update the oldest row of a MySQL table that matches a certain condition than using ORDER BY id LIMIT 1 as in the following query?

UPDATE mytable SET field1 = '1' WHERE field1 = 0 ORDER BY id LIMIT 1;

Note:

  • Assume the primary key is id and there is also a index on field1.
  • We are updating a single row.
  • We are not updating strictly the oldest row, we are updating the oldest row that matches a condition.
  • We want to update the oldest matching row, i.e the lowest id, i.e. the head of the FIFO queue.

Questions:

  • Is the ORDER BY id necessary? How does MySQL order by default?

Real world example

We have a DB table being used for a email queue. Rows are added when we want to queue emails to send to our users. Rows are removed by a cron job, run each minute, processing as many as possible in that minute and sending 1 email per row.

We plan to ditch this approach and use something like Gearman or Resque to process our email queue. But in the meantime I have a question on how we can efficiently mark the oldest item of the queue for processing, a.k.a. The row with the lowest ID. This query does the job:

mysql_query("UPDATE email_queue SET processingID = '1' WHERE processingID = 0 ORDER BY id LIMIT 1");

However, it is appearing in the mysql slow log a lot due to scaling issues. The query can take more than 10s when the table has 500,000 rows. The problem is that this table has grown massively since it was first introduced and now sometimes has half a million rows and a overhead of 133.9 MiB. For example we INSERT 6000 new rows perhaps 180 times a day and DELETE roughly the same number.

To stop the query appearing in the slow log we removed the ORDER BY id to stop a massive sort of the whole table. i.e.

mysql_query("UPDATE email_queue SET processingID = '1' WHERE processingID = 0 LIMIT 1");

... but the new query no longer always gets the row with the lowest id (although it often does). Is there a more efficient way of getting the row with the lowest id other than using ORDER BY id ?

For reference, this is the structure of the email queue table:

CREATE TABLE IF NOT EXISTS `email_queue` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `time_queued` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Time when item was queued',
  `mem_id` int(10) NOT NULL,
  `email` varchar(150) NOT NULL,
  `processingID` int(2) NOT NULL COMMENT 'Indicate if row is being processed',
  PRIMARY KEY (`id`),
  KEY `processingID` (`processingID`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;
like image 524
Tom Avatar asked Sep 08 '10 11:09

Tom


2 Answers

Give this a read:

  • ORDER BY … LIMIT Performance Optimization
like image 56
shamittomar Avatar answered Oct 12 '22 23:10

shamittomar


sounds like you have other processes locking the table preventing your update completing in a timely manner - have you considered using innodb ?

like image 34
Jon Black Avatar answered Oct 12 '22 23:10

Jon Black