MySQL 5.5.28. I have two tables Person
and Message
and the latter has a foreign key to the former. Each table has id
as the primary key column, and the Person
table also has a column personId
which is (uniquely) indexed.
The query below should take advantage of the personId
key index, but instead MySQL requires scanning the entire Message
table for some reason:
mysql> EXPLAIN SELECT `m`.* -> FROM -> `Message` AS `m` -> LEFT JOIN -> `Person` AS `p` ON (`m`.`person` = `p`.`id`) -> WHERE -> 'M002649397' IS NULL OR -> `p`.`personId` = 'M002649397'; +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+ | 1 | SIMPLE | m | ALL | NULL | NULL | NULL | NULL | 273220 | | | 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | pcom.m.person | 1 | Using where | +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+ 2 rows in set (0.00 sec)
But when I comment out the 'M002649397' IS NULL OR
clause (which has no effect on the result), the query suddenly gets more efficient:
mysql> EXPLAIN SELECT `m`.* -> FROM -> `Message` AS `m` -> LEFT JOIN -> `Person` AS `p` ON (`m`.`person` = `p`.`id`) -> WHERE -> -- 'M002649397' IS NULL OR -> `p`.`personId` = 'M002649397'; +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+ | 1 | SIMPLE | p | const | PRIMARY,personId | personId | 767 | const | 1 | Using index | | 1 | SIMPLE | m | ref | FK9C2397E7A0F6ED11 | FK9C2397E7A0F6ED11 | 9 | const | 3 | Using where | +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+ 2 rows in set (0.01 sec)
My question is: why isn't MySQL smart enough to realize that 'M002649397' IS NULL
is always false, optimize it away, and save having to needlessly scan every row in a huge table?
In other words, does the MySQL optimizer not know that 'M002649397' IS NULL
is always false, or is it failing to apply that optimization to the query when constructing its query plan?
Actually, what is more interesting, is that the documentation says that MySQL is smart enough to do this (see here).
This would seem to come under the heading "8.2.1.2. Eliminating “Dead” Code".
I suppose the reason is that the developers did not consider an expression such as " is not null" when the code was written. The documentation gives many examples based on constant propagation (x1 = 2 and x2 = x1
becomes x1 = 2 and x2 = 2
). is null
probably does arise in this situation.
This is a verified MySQL bug.
You can not have one execution plan for the condition:
WHERE (0 = 1) OR p.personId = 'string_constant';
and another execution plan for:
WHERE p.personId = 'string_constant';
because (0 = 1) always results in FALSE, which makes the above two queries 100 % identical.
You can see in the bug report itself that execution plan when (0 = 1) OR is present is much worse then the one where the expression is only the equality of a column to a constant.
*Note this is fixed in MariaDB.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With