Recently an issue came up in the Django community regarding testing of MySQL (using MyISAM).
Here's the django ticket: http://code.djangoproject.com/ticket/14661
One of the Django core developers came up with this test and many of us have been able to replicate it. Anyone have a guess as to what we are running into here? Is it simply a bug in MySQL or am I missing something?
Here's the test code and queries:
DROP TABLE IF EXISTS `testapp_tag`;
CREATE TABLE `testapp_tag` (
`id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
`name` varchar(10) NOT NULL,
`parent_id` integer
);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t1", NULL);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t2", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t3", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t4", 3);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t5", 3);
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
Here's the output:
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
| 5 | t5 | 3 |
+----+------+-----------+
3 rows in set (0.00 sec)
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
+----+------+-----------+
2 rows in set (0.01 sec)
Seems very interesting, and looks like a bug in the MySql query optimizer.
If you run this instead of the plain selects:
EXPLAIN EXTENDED SELECT `testapp_tag`.`id`, ....;
SHOW WARNINGS;
EXPLAIN EXTENDED SELECT `testapp_tag`.`id`, ...;
SHOW WARNINGS;
Then, comparing the output from the EXPLAIN EXTENDED
warnings, You can see that the first time around, the optimizer adds to the select:
or (`test`.`testapp_tag`.`id` = 5)
Also, note that removing the AND testapp_tag.id IS NOT NULL
from the WHERE
, which does nothing since the field is marked as NOT NULL
, seems to take away the issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With