Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql difference in index usage between MyISAM and InnoDB

I have these small tables, item and category:

CREATE TABLE `item` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(150) NOT NULL,
  `category_id` mediumint(8) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`),
  KEY `category_id` (`category_id`)
) CHARSET=utf8

CREATE TABLE `category` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(150) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) CHARSET=utf8

I have inserted 100 categories and 1000 items.

If I run this:

EXPLAIN SELECT item.id,category.name AS category_name FROM item JOIN category ON item.category_id=category.id;

Then, if the tables' engine is InnoDB I get:

+----+-------------+----------+-------+---------------+-------------+---------+--------------------+------+-------------+
| id | select_type | table    | type  | possible_keys | key         | key_len | ref                | rows | Extra       |
+----+-------------+----------+-------+---------------+-------------+---------+--------------------+------+-------------+
|  1 | SIMPLE      | category | index | PRIMARY       | name        | 452     | NULL               |  103 | Using index |
|  1 | SIMPLE      | item     | ref   | category_id   | category_id | 3       | dbname.category.id |    5 | Using index |
+----+-------------+----------+-------+---------------+-------------+---------+--------------------+------+-------------+

Whereas, if I switch to MyISAM (with alter table engine=myisam) I get:

+----+-------------+----------+--------+---------------+---------+---------+-------------------------+------+-------+
| id | select_type | table    | type   | possible_keys | key     | key_len | ref                     | rows | Extra |
+----+-------------+----------+--------+---------------+---------+---------+-------------------------+------+-------+
|  1 | SIMPLE      | item     | ALL    | category_id   | NULL    | NULL    | NULL                    | 1003 |       |
|  1 | SIMPLE      | category | eq_ref | PRIMARY       | PRIMARY | 3       | dbname.item.category_id |    1 |       |
+----+-------------+----------+--------+---------------+---------+---------+-------------------------+------+-------+

My question is, why this difference in the way indexes are handled?

like image 820
periklis Avatar asked Apr 15 '13 20:04

periklis


1 Answers

In InnoDB, any secondary index internally contains the primary key column of the table. So the index name on column (name) is implicitly on columns (name, id).

This means that EXPLAIN shows your access to the category table as an "index-scan" (this is shown in the type column as "index"). By scanning the index, it also has access to the id column, which it uses to look up rows in the second table, item.

Then it also takes advantage of the item index on (category_id) which is really (category_id, id), and it is able to fetch item.id for your select-list simply by reading the index. No need to read the table at all (this is shown in the Extra column as "Using index").

MyISAM doesn't store primary keys with the secondary key in this way, so it can't get the same optimizations. The access to the category table is type "ALL" which means a table-scan.

I would expect the access to the MyISAM table item would be "ref" as it looks up rows using the index on (category_id). But the optimizer may get skewed results if you have very few rows in the table, or if you haven't done ANALYZE TABLE item since creating the index.


Re your update:

It looks like the optimizer prefers an index-scan over a table-scan, so it takes the opportunity to do an index-scan in InnoDB, and puts the category table first. The optimizer decides to re-order the tables instead of using the tables in the order you gave them in your query.

In the MyISAM tables, there will be one table-scan whichever table it chooses to access first, but by putting the category table second, it joins to category's PRIMARY key index instead of item's secondary index. The optimizer prefers lookups to a unique or primary key (type "eq_ref").

like image 76
Bill Karwin Avatar answered Oct 24 '22 09:10

Bill Karwin