Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why would an indexed column return results slowly when querying for `IS NULL`?

I have a table with 25 million rows, indexed appropriately.

But adding the clause AND status IS NULL turns a super fast query into a crazy slow query.

Please help me speed it up.

Query:

SELECT 
    student_id,
    grade,
    status
FROM 
    grades
WHERE 
        class_id = 1
    AND status IS NULL       -- This line delays results from <200ms to 40-70s!
    AND grade BETWEEN 0 AND 0.7
LIMIT 25;

Table:

CREATE TABLE IF NOT EXISTS `grades` (
  `student_id` BIGINT(20) NOT NULL,
  `class_id` INT(11) NOT NULL,
  `grade` FLOAT(10,6) DEFAULT NULL,
  `status` INT(11) DEFAULT NULL,
  UNIQUE KEY `unique_key` (`student_id`,`class_id`),
  KEY `class_id` (`class_id`),
  KEY `status` (`status`),
  KEY `grade` (`grade`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Local development shows results instantly (<200ms). Production server is huge slowdown (40-70 seconds!).

Can you point me in the right direction to debug?

Explain:

+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
| id | select_type | table  | type        | possible_keys         | key             | key_len | ref  | rows  | Extra                                                  |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
|  1 | SIMPLE      | grades | index_merge | class_id,status,grade | status,class_id | 5,4     | NULL | 26811 | Using intersect(status,class_id); Using where          |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
like image 212
Ryan Avatar asked Aug 13 '14 21:08

Ryan


1 Answers

A SELECT statement can only use one index per table.

Presumably the query before just did a scan using the sole index class_id for your condition class_id=1. Which will probably filter your result set nicely before checking the other conditions.

The optimiser is 'incorrectly' choosing an index merge on class_id and status for the second query and checking 26811 rows which is probably not optimal. You could hint at the class_id index by adding USING INDEX (class_id) to the end of the FROM clause. You may get some joy with a composite index on (class_id,status,grade) which may run the query faster as it can match the first two and then range scan the grade. I'm not sure how this works with null though.

I'm guessing the ORDER BY pushed the optimiser to choose the class_id index again and returned your query to it's original speed.

like image 198
Arth Avatar answered Oct 18 '22 18:10

Arth