Why index made this query slower?

Tags:

1 Answers

Summary

The problem is that field is not a good candidate for indexing, due to the nature of b-trees.

Explanation

Let's suppose you have a table that has the results of 500,000 coin tosses, where the toss is either 1 (heads) or 0 (tails):

CREATE TABLE toss (
    id int NOT NULL AUTO_INCREMENT,
    result int NOT NULL DEFAULT '0',
    PRIMARY KEY ( id )
)

select result, count(*) from toss group by result order by result;
+--------+----------+
| result | count(*) |
+--------+----------+
|      0 |   250290 |
|      1 |   249710 |
+--------+----------+
2 rows in set (0.40 sec)

If you want to select one toss (at random) where the toss was tails, then you need to search through your table, picking a random starting place.

select * from toss where result != 1 limit 123456, 1;
+--------+--------+
| id     | result |
+--------+--------+
| 246700 |      0 |
+--------+--------+
1 row in set (0.06 sec)

explain select * from toss where result != 1 limit 123456, 1;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | toss  | ALL  | NULL          | NULL | NULL    | NULL | 500000 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+

You see that you're basically searching sequentially through all of the rows to find a match.

If you create an index on the toss field, then your index will contain two values, each with roughly 250,000 entries.

create index foo on toss ( result );
Query OK, 500000 rows affected (2.48 sec)
Records: 500000  Duplicates: 0  Warnings: 0

select * from toss where result != 1 limit 123456, 1;
+--------+--------+
| id     | result |
+--------+--------+
| 246700 |      0 |
+--------+--------+
1 row in set (0.25 sec)

explain select * from toss where result != 1 limit 123456, 1;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | toss  | range | foo           | foo  | 4       | NULL | 154565 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+

Now you're searching fewer records, but the time to search increased from 0.06 to 0.25 seconds. Why? Because sequentially scanning an index is actually less efficient than sequentially scanning a table, for indexes with a large number of rows for a given key.

Let's look at the indexes on this table:

show index from toss;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| toss  |          0 | PRIMARY  |            1 | id          | A         |      500000 |     NULL | NULL   |      | BTREE      |         |
| toss  |          1 | foo      |            1 | result      | A         |           2 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

The PRIMARY index is a good index: there are 500,000 rows, and there are 500,000 values. Arranged in a BTREE, you can quickly identify a single row based on the id.

The foo index is a bad index: there are 500,000 rows, but only 2 possible values. This is pretty much the worst possible case for a BTREE -- all of the overhead of searching the index, and still having to search through the results.

141

answered Oct 09 '22 09:10

Craig Trader

Related questions
                            
                                Error when using else if in sql trigger
                            
                                Fatal error: Call to undefined method DateTime::createfromformat()
                            
                                MYSQL - How to workaround the row size limit of 66 KBytes
                            
                                How to count items per category?
                            
                                Export and import data from PHPMYADMIN
                            
                                PHP Framework or not (Cake PHP)
                            
                                populate a MySQL with a large series of rows quickly
                            
                                syntax error on DECLARE CURSOR FOR
                            
                                MySql fulltext search in PHP using string containing keywords
                            
                                MySQL LOAD_FILE() loads null values
                            
                                MySQL connector 6.7.4 and Entity Framework 5 exceptions
                            
                                SQL Error 1064 - Import Error
                            
                                Run a query foreach databases (mysql)
                            
                                Why Mysql's Group By and Oracle's Group by behaviours are different
                            
                                mysql stored procedure error: missing semicolon
                            
                                NodeJS: Backup MySQL database
                            
                                How to make MySQL use less memory?
                            
                                phpMyAdmin Error in processing request Error code: 500 Error text: Internal Server Error [duplicate]
                            
                                Migrating a MySQL server from one box to another
                            
                                using mysql_close()

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why index made this query slower?

Tags:

indexing

mysql

Qiao

People also ask

1 Answers

Craig Trader

Recent Activity

Donate For Us