Need help understanding how mysql indexes work

Tags:

I have a table that looks like this:

CREATE TABLE `metric` (
  `metricid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `host` varchar(50) NOT NULL,
  `userid` int(10) unsigned DEFAULT NULL,
  `lastmetricvalue` double DEFAULT NULL,
  `receivedat` int(10) unsigned DEFAULT NULL,
  `name` varchar(255) NOT NULL,
  `sampleid` tinyint(3) unsigned NOT NULL,
  `type` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `lastrawvalue` double NOT NULL,
  `priority` tinyint(3) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`metricid`),
  UNIQUE KEY `unique-metric` (`userid`,`host`,`name`,`sampleid`)
) ENGINE=InnoDB AUTO_INCREMENT=1000000221496 DEFAULT CHARSET=utf8

It has 177,892 rows at the moment, and when I run the following query:

select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE m.userid = 8
  AND (host, name, sampleid) IN (('localhost','0.4350799184758216cpu-3/cpu-nice',0),
  ('localhost','0.4350799184758216cpu-3/cpu-system',0),
  ('localhost','0.4350799184758216cpu-3/cpu-idle',0),
  ('localhost','0.4350799184758216cpu-3/cpu-wait',0),
  ('localhost','0.4350799184758216cpu-3/cpu-interrupt',0),
  ('localhost','0.4350799184758216cpu-3/cpu-softirq',0),
  ('localhost','0.4350799184758216cpu-3/cpu-steal',0),
  ('localhost','0.4350799184758216cpu-4/cpu-user',0),
  ('localhost','0.4350799184758216cpu-4/cpu-nice',0),
  ('localhost','0.4350799184758216cpu-4/cpu-system',0),
  ('localhost','0.4350799184758216cpu-4/cpu-idle',0),
  ('localhost','0.4350799184758216cpu-4/cpu-wait',0),
  ('localhost','0.4350799184758216cpu-4/cpu-interrupt',0),
  ('localhost','0.4350799184758216cpu-4/cpu-softirq',0),
  ('localhost','0.4350799184758216cpu-4/cpu-steal',0),
  ('localhost','_util/billing-bytes',0),('localhost','_util/billing-metrics',0));

it takes 0.87 seconds to return results, explain is:

    *************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: m
         type: ref
possible_keys: unique-metric
          key: unique-metric
      key_len: 5
          ref: const
         rows: 85560
        Extra: Using where
1 row in set (0.00 sec)

profile looks like this:

+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000160 |
| checking permissions           | 0.000010 |
| Opening tables                 | 0.000021 |
| exit open_tables()             | 0.000008 |
| System lock                    | 0.000008 |
| mysql_lock_tables(): unlocking | 0.000005 |
| exit mysqld_lock_tables()      | 0.000007 |
| init                           | 0.000068 |
| optimizing                     | 0.000018 |
| statistics                     | 0.000091 |
| preparing                      | 0.000042 |
| executing                      | 0.000005 |
| Sending data                   | 0.870180 |
| innobase_commit_low():trx_comm | 0.000012 |
| Sending data                   | 0.000111 |
| end                            | 0.000009 |
| query end                      | 0.000009 |
| ha_commit_one_phase(-1)        | 0.000015 |
| innobase_commit_low():trx_comm | 0.000004 |
| ha_commit_one_phase(-1)        | 0.000005 |
| query end                      | 0.000005 |
| closing tables                 | 0.000012 |
| freeing items                  | 0.000562 |
| logging slow query             | 0.000005 |
| cleaning up                    | 0.000005 |
| sleeping                       | 0.000006 |
+--------------------------------+----------+

Which seems way too high for me. I've tried to replace the userid = 8 and (host, name, sampleid) IN part of the first query to (userid, host, name, sampleid) IN and this query runs about 0.5s - almost 2 times quicker, for reference, here's the query:

select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE (userid, host, name, sampleid) IN ((8,'localhost','0.4350799184758216cpu-3/cpu-nice',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-system',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-idle',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-wait',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-interrupt',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-softirq',0),
  (8,'localhost','0.4350799184758216cpu-3/cpu-steal',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-user',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-nice',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-system',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-idle',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-wait',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-interrupt',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-softirq',0),
  (8,'localhost','0.4350799184758216cpu-4/cpu-steal',0),
  (8,'localhost','_util/billing-bytes',0),
  (8,'localhost','_util/billing-metrics',0));

its explain looks like this:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: m
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 171121
        Extra: Using where
1 row in set (0.00 sec)

Next I've updated the table to contain a single joined column:

alter table `metric` add `forindex` varchar(120) not null default '';
update metric set forindex = concat(userid,`host`,`name`,sampleid);
alter table metric add index `forindex` (`forindex`);

Updated the query to have only 1 string searched:

select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE (forindex) IN (('8localhost0.4350799184758216cpu-3/cpu-nice0'),
  ('8localhost0.4350799184758216cpu-3/cpu-system0'),
  ('8localhost0.4350799184758216cpu-3/cpu-idle0'),
  ('8localhost0.4350799184758216cpu-3/cpu-wait0'),
  ('8localhost0.4350799184758216cpu-3/cpu-interrupt0'),
  ('8localhost0.4350799184758216cpu-3/cpu-softirq0'),
  ('8localhost0.4350799184758216cpu-3/cpu-steal0'),
  ('8localhost0.4350799184758216cpu-4/cpu-user0'),
  ('8localhost0.4350799184758216cpu-4/cpu-nice0'),
  ('8localhost0.4350799184758216cpu-4/cpu-system0'),
  ('8localhost0.4350799184758216cpu-4/cpu-idle0'),
  ('8localhost0.4350799184758216cpu-4/cpu-wait0'),
  ('8localhost0.4350799184758216cpu-4/cpu-interrupt0'),
  ('8localhost0.4350799184758216cpu-4/cpu-softirq0'),
  ('8localhost0.4350799184758216cpu-4/cpu-steal0'),
  ('8localhost_util/billing-bytes0'),
  ('8localhost_util/billing-metrics0'));

And now I get the same results in 0.00 sec! Explain is:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: m
         type: range
possible_keys: forindex
          key: forindex
      key_len: 362
          ref: NULL
         rows: 17
        Extra: Using where
1 row in set (0.00 sec)

So to summarize, here are the results:

m.userid = X AND (host, name, sampleid) IN - index used, 85560 rows scanned, runs in 0.9s
(userid, host, name, sampleid) IN - index not used, 171121 rows scanned, runs in 0.5s
additional column with compound index replaced with an index over a concatenated utility column - index used, 17 rows scanned, runs in 0s

Why does second query run faster than the first? And why is the third query so much faster than the rest? Should I keep such a column for the sole purpose of faster searching?

Mysql version is: mysqld Ver 5.5.34-55 for Linux on x86_64 (Percona XtraDB Cluster (GPL), wsrep_25.9.r3928)

751

asked Apr 10 '14 17:04

Fluffy

1 Answers

Indexes help your search terms in the WHERE clause by narrowing down the search as much as possible. You can see this happening...

The rows field of EXPLAIN gives an estimate of how many rows the query will have to examine to find the rows that match your query. By comparing the rows reported in each EXPLAIN, you can see how much better your better-optimized query is:

     rows: 85560 -- first query

     rows: 171121 -- second query examines 2x more rows, but it was probably 
                  -- faster because the data was buffered after the first query

     rows: 17 -- third query examines 5,000x fewer rows than first query

You would also notice in the SHOW PROFILE details if you ran that for the third query that "Sending data" is a lot faster for the quicker query. This process state indicates how long it took to copy rows from the storage engine up to the SQL layer of MySQL. Even when doing memory-to-memory copying, this takes a while for so many thousands of rows. This is why indexes are so beneficial.

For more useful explanation, see my presentation How to Design Indexes, Really.

197

answered Sep 18 '22 23:09

Bill Karwin

Related questions
                            
                                Connecting to mysql database with go
                            
                                SELECT min max and max min value from a part of a table in MySQL
                            
                                MySQL select or update acting very strange
                            
                                MySQL : why is left join slower then inner join? Optimization help required
                            
                                Python with MySql "SAWarning: Unicode type received non-unicode bind param value" error
                            
                                Similar names in a huge list
                            
                                MySQL: Eliminating duplicate rows without breaking a foreign key constraint
                            
                                Rails Octopus Gem - Master-Slave Replication connection behaviour when slave is down
                            
                                MySQL: Using COUNT function inside a SELECT statement
                            
                                SQL sort string with numbers certain pattern first
                            
                                SQL Get entire row based on minimum value of calculated column
                            
                                Select specific columns from SHOW ERRORS query
                            
                                Removing non-breaking spaces?
                            
                                How to use UUIDs instead of integers in MySQL DB
                            
                                How to Print the values obtained in a table after self joining two table in 'php'?
                            
                                Select last N# of columns of a table from MySQL
                            
                                Mysql Bulk update performance improvements for (when.. case)
                            
                                Will ResultSet leak if not explicitly closed?
                            
                                CakePHP 2 Composite Primary Key
                            
                                Mysql query to fetch limited rows of each type

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Need help understanding how mysql indexes work

Tags:

performance

indexing

mysql

innodb

Fluffy

People also ask

1 Answers

Bill Karwin

Recent Activity

Donate For Us