Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does MySQL decide on wrong index?

I have a partitioned table in MySQL that looks like this:

CREATE TABLE `table1` (
   `id` bigint(19) NOT NULL AUTO_INCREMENT,
   `field1` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
   `field2_id` int(11) NOT NULL,
   `created_at` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
   PRIMARY KEY (`id`,`created_at`),
   KEY `index1` (`field2_id`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=603221206 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
/*!50100 PARTITION BY RANGE (to_days(created_at))
(PARTITION p_0 VALUES LESS THAN (730485) ENGINE = InnoDB,
  ..... lots more partitions .....
 PARTITION p_20130117 VALUES LESS THAN (735250) ENGINE = InnoDB) */;

And this is a typical SELECT query on the table:

 SELECT field1 from TABLE1 where field2_id = 12345 and id > 13314313;

Doing an explain on it, MySQL sometimes decides to use PRIMARY instead of index1. This seems to be pretty consistent when you do a first explain. However, after a few repeated explains, MySQL finally decides to use the index. The problem is, this table has millions of rows, and inserts and selects are hitting it on the order of several times per second. Choosing the wrong index was causing these SELECT queries to take up to ~40 seconds, instead of sub second times. Can't really schedule downtime, so I can't run an optimize on the table (because of the size, it would probably take a long time), and not sure it would help in this case anyway.

I fixed this by forcing the index, so it looks like this:

 SELECT field1 from TABLE1 FORCE INDEX (index1) WHERE field2_id = 12345 and id > 13314313;

We're running this on MySQL 5.1.63, which we can't move away from at the moment.

My question is, why is MySQL choosing the wrong index? And is there something that can be done to fix it, besides forcing the index on all queries? Is partitioning confusing the InnoDB engine? I've worked a lot with MySQL, and have never seen this behavior before. The query is as simple as can be, and the index is also a perfect match. We have a lot of queries that are assuming the DB layer will do the right thing, and I don't want to go through all of them forcing to use the correct index.

Update 1:

This is the typical explain, without the FORCE INDEX clause. Once that's put in, the possible keys column only show the forced index.

 id select_type   table     type    possible_keys    key     key_len  ref   rows
  1 SIMPLE        table1    range   PRIMARY,index1   index1  12       NULL  207
like image 483
mjuarez Avatar asked Jan 19 '13 12:01

mjuarez


2 Answers

I'm not 100% sure, but i think this sounds logic:

You partition your table BY RANGE (to_days(created_at)). the created_at field is part of the primary_key. Your select-queries are using the other part of the primary-key. This way the server optimization engine thinks this would be the speediest index - using the partition and the id-primary-part.

i suggest (without knowing the real cause that lead to your choice) to change your partition-range to the id and change the order of your index1-key.

for more information on partitioning have a look

like image 136
Bastian Rang Avatar answered Oct 17 '22 03:10

Bastian Rang


I'm not sure why the engine would pick the incorrect index. I would think that an index that has an EQUALITY test would supersede that of one with a >, < or range. However, another option that might help force the correct index would be to force a "computed" value on the other id column so the engine might not be able to do a direct correlation to the index... Something like

WHERE field2_id = 12345 and id > 13314313

changed to

WHERE field2_id = 12345 and id + 0 > 13314313
like image 34
DRapp Avatar answered Oct 17 '22 01:10

DRapp