MySQL (id >= N AND col2 IS NULL) query unexpectedly slow for large N

We are using MySQL 5.5.42.

We have a table publications containing about 150 million rows (about 140 GB on an SSD).

The table has many columns, of which two are of particular interest:

  • id is the primary key of the table and is of type bigint
  • cluster_id is a nullable column of type bigint

Both columns have their own (separate) index.

We make queries of the form

SELECT * FROM publications
WHERE id >= 14032924480302800156 AND cluster_id IS NULL
ORDER BY id
LIMIT 0, 200;

Here is the problem: The larger the id value (14032924480302800156 in the example above), the slower the request.

In other words, requests with a low id value are fast (< 0.1 s), but the higher the id value, the slower the request (up to minutes).

Everything is fine if we use another (indexed) column in the WHERE clause. For instance

SELECT * FROM publications
WHERE inserted_at >= '2014-06-20 19:30:25' AND cluster_id IS NULL
ORDER BY inserted_at
LIMIT 0, 200;

where inserted_at is of type timestamp.

Edit:

Output of EXPLAIN when using id >= 14032924480302800156:

id | select_type | table        | type | possible_keys      | key        | key_len | ref   | rows     | Extra
---+-------------+--------------+------+--------------------+------------+---------+-------+----------+------------
1  | SIMPLE      | publications | ref  | PRIMARY,cluster_id | cluster_id | 9       | const | 71647796 | Using where

Output of EXPLAIN when using inserted_at >= '2014-06-20 19:30:25':

id | select_type | table        | type | possible_keys          | key        | key_len | ref   | rows     | Extra
---+-------------+--------------+------+------------------------+------------+---------+-------+----------+------------
1  | SIMPLE      | publications | ref  | inserted_at,cluster_id | cluster_id | 9       | const | 71647796 | Using where

Asked Jul 16 '15 09:07 by François Beaune



1 Answer

This involves some guesswork, but MySQL appears to be using the indexes in the wrong order. The PRIMARY index seems to be treated quite differently from the others.

In the first query, both the PRIMARY index and the index on cluster_id can be used. For some reason, MySQL ignores the PRIMARY index and looks at the index on cluster_id first, where your condition is that the value must be NULL. Your EXPLAIN output shows this: the chosen key is cluster_id, with an estimated 71647796 rows. That leaves a huge, potentially unordered (NULLs everywhere!) set of rows still to be filtered by id.

With the second query it is different: the PRIMARY index cannot be used at all, so MySQL works out what to use in a better way, apparently using the index on inserted_at first without any hints.

What it should actually do in the first query is use the PRIMARY index first (you can tell it to do so with an index hint). I am not a MySQL user; all of this guesswork is backed only by my own understanding of the internal data structures. I don't know whether MySQL can apply the index on cluster_id on top of those results, but creating a composite index and comparing performance with and without it may give clues as to whether it is used.
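
The two suggestions above could be tried roughly as follows. This is an untested sketch, not a verified fix: FORCE INDEX and ALTER TABLE ... ADD INDEX are standard MySQL syntax, but the index name idx_cluster_id_id is made up for illustration, and whether either change actually speeds up the query on this table is an assumption to be checked with EXPLAIN and timing.

```sql
-- 1) Ask the optimizer to use the primary key instead of the cluster_id index.
--    Run with EXPLAIN first to confirm that the chosen key changes to PRIMARY.
EXPLAIN
SELECT * FROM publications FORCE INDEX (PRIMARY)
WHERE id >= 14032924480302800156 AND cluster_id IS NULL
ORDER BY id
LIMIT 0, 200;

-- 2) Alternatively, add a composite index so that the NULL test and the id range
--    can both be resolved from a single index.
--    The name idx_cluster_id_id is hypothetical.
--    Note: building an index on a table of about 150 million rows may take a long time.
ALTER TABLE publications ADD INDEX idx_cluster_id_id (cluster_id, id);

-- Re-check the plan for the original query once the index exists.
EXPLAIN
SELECT * FROM publications
WHERE id >= 14032924480302800156 AND cluster_id IS NULL
ORDER BY id
LIMIT 0, 200;
```

Comparing the rows estimate and the wall-clock time of the original query before and after each change should show whether the hint or the composite index is actually being used.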

Answered Oct 16 '22 05:10 by D-side