Does anyone know why there is no GROUP BY optimization for MyISAM-based tables? (I'm using this version: 5.1.49-3)
CREATE TABLE `_test2_innodb` (
`i` int(10) unsigned NOT NULL AUTO_INCREMENT,
`n` smallint(5) unsigned NOT NULL,
`t` int(10) unsigned NOT NULL,
`v` smallint(6) NOT NULL,
PRIMARY KEY (`i`),
KEY `i_n` (`n`),
KEY `i_t` (`t`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `_test2_myisam` (
`i` int(10) unsigned NOT NULL AUTO_INCREMENT,
`n` smallint(5) unsigned NOT NULL,
`t` int(10) unsigned NOT NULL,
`v` smallint(6) NOT NULL,
PRIMARY KEY (`i`),
KEY `i_n` (`n`),
KEY `i_t` (`t`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
SELECT MAX(i) FROM _test2_myisam GROUP BY n;
SELECT MAX(i) FROM _test2_innodb GROUP BY n;
id, select_type, table, type, poss_keys, key, key_len, ref, rows, extra
1, SIMPLE, _test2_myisam , ALL, , , , , 19998, Using temporary; Using filesort
1, SIMPLE, _test2_innodb, index, , i_n, 2, , 20024, Using index
The problem is that if I'm using MyISAM, a full table scan will be done, which takes hours on large tables... And the MySQL documentation doesn't mention anything about table engines having different implementations ( http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html ). Does anyone know why is this handled differently internally?
(Note: no, it is not a good solution to switch to InnoDB) Thanks
The difference between the two tables, despite the seemingly identical definitions, is that MyISAM tables are "heaps" while InnoDB tables are clustered organized tables, e.g. the clustered index is the table (usually the primary key, the (i)
index here).
The other difference, that causes the different execution plans is that in InnoDB, all non-clustered indexes (your (n)
index in this case), include also the columns of the clustered index, the (i)
in this case, so no full table scan is needed.
In other words, the (n)
InnoDB index is roughly equivalent to a (n, PK)
MyISAM index.
The MyISAM engine has to either do a full table scan or an index scan of the (n)
index and then also scan the table (to get the values of the i
column). So, it chooses the first plan (full scan + filesort to find the MAX values).
Do the test again, after adding an index on (n, i)
in the MyISAM table:
ALTER TABLE _test2_myisam
ADD INDEX n_i (n, i) ;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With