
How to improve an already optimized query that takes 18 seconds?

So I have a VPS with 512 MB of RAM, and a MySQL table like this:

CREATE TABLE `table1` (
  `id` int(20) unsigned NOT NULL auto_increment,
  `ts` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `value1` char(31) collate utf8_unicode_ci default NULL,
  `value2` varchar(100) collate utf8_unicode_ci default NULL,
  `value3` varchar(100) collate utf8_unicode_ci default NULL,
  `value4` mediumtext collate utf8_unicode_ci,
  `type` varchar(30) collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `type` (`type`),
  KEY `date` (`ts`)
) ENGINE=MyISAM AUTO_INCREMENT=469692 DEFAULT CHARSET=utf8
  COLLATE=utf8_unicode_ci

If I execute a query like this, it takes 2~18 seconds to complete:

SELECT `id`, `ts`, `value1`, `value2`, `value3` FROM table1 WHERE
`type` = 'something' ORDER BY `id` DESC limit 0,10; 

EXPLAIN SELECT tells me:

  select_type: SIMPLE
         type: ref
possible_keys: type
          key: type
      key_len: 92
          ref: const
         rows: 7291
        Extra: Using where; Using filesort

I thought the 'Using filesort' might be the problem, but it turns out that's not the case: if I remove the ORDER BY and the LIMIT, the query speed is the same. (I turned off the query cache for testing with SET @@query_cache_type=0;.)

mysql> EXPLAIN SELECT `id`,`ts`,`value1`,`value2`, `value3` 
       FROM table1 WHERE `type` = 'something'\G

  select_type: SIMPLE
         type: ref
possible_keys: type
          key: type
      key_len: 92
          ref: const
         rows: 7291
        Extra: Using where
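
(As an aside: instead of disabling the cache for the whole session, MySQL versions of that era also accept a per-statement hint. A minimal sketch of the same test query:)

SELECT SQL_NO_CACHE `id`, `ts`, `value1`, `value2`, `value3`
FROM table1 WHERE `type` = 'something'
ORDER BY `id` DESC LIMIT 0,10;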

I don't know if it matters, but the row estimate is inaccurate:

SELECT COUNT(*) FROM table1 WHERE `type` = 'something';

Returns 22.8k rows.

The query already seems optimized; I don't know how I could improve it further. The whole table contains 370k rows and is about 4.6 GiB in size. Could it be that, because the rows of a given type are randomly distributed across the whole table, it takes 2~18 seconds just to fetch the data from disk?
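
(To verify those size figures, and therefore how much data would have to stay cached for the query to be fast, SHOW TABLE STATUS reports the raw byte counts:)

SHOW TABLE STATUS LIKE 'table1'\G
-- Data_length and Index_length show how many bytes the OS cache
-- and the MyISAM key buffer would need to hold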

The funny thing is that when I use a type that has only a few hundred rows, those queries are slow too. MySQL returns rows at only about 100 rows/sec!

|-------+----------+----------|
| count | time (s) | rows/sec |
|-------+----------+----------|
| 22802 |     18.7 |  1219.36 |
|    11 |      0.1 |   110.00 |
|   491 |      4.8 |   102.29 |
|   705 |      5.6 |   125.89 |
|   317 |      2.6 |   121.92 |
|-------+----------+----------|

Why is it so slow? Can I further optimize the query? Should I move the data to smaller tables?

I thought automatic partitioning would be a good idea: make a new partition for every type dynamically. That is not possible, for several reasons, including that the maximum number of partitions is 1024 and there can be arbitrarily many types. I could also try application-level partitioning, creating a new table for every new type, but I'd rather not: it introduces a lot of complexity, and I don't know how I could keep a unique id across all rows in all tables (though see the sequence-table sketch below). Also, if I reach multiple inserts per second, performance would drop significantly.
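
(For reference, the usual workaround for a unique id spanning many tables is a dedicated sequence table; a minimal sketch, where the name `seq` is made up:)

CREATE TABLE `seq` (
  `id` int(20) unsigned NOT NULL auto_increment,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;

-- allocate the next global id, then use it when inserting into a per-type table:
INSERT INTO `seq` VALUES (NULL);
SELECT LAST_INSERT_ID();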

Thanks in advance.

atlau asked Aug 06 '11 17:08



1 Answer

You need a multi-column index for that query:

KEY `typeid` (`type`, `id`)
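
On the existing table that would be, for example:

ALTER TABLE `table1` ADD KEY `typeid` (`type`, `id`);

With (`type`, `id`), the rows matching one type come out of the index already ordered by `id`, so EXPLAIN should no longer show 'Using filesort' for the original ORDER BY `id` DESC LIMIT query.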

Unfortunately, as you stated, it is also slow without the ORDER BY, so it's slow because the records are scattered around on the disk and the engine has to do a lot of seeks. Once cached, it should be quite fast. (Note: 22.8k/370k * 4.6 GiB ≈ 283 MiB, so if you run other activities/queries, those records won't stay in memory for long, and might not even fit.)

Run iostat 1 to verify the I/O bottleneck. Loads of RAM would solve your problem; an SSD would too, but RAM is cheaper ;)
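
It's also worth checking whether the MyISAM key buffer at least holds the indexes (MyISAM caches only index blocks itself; data blocks rely on the OS page cache). A quick look, with nothing assumed about your config:

SHOW VARIABLES LIKE 'key_buffer_size';
SHOW STATUS LIKE 'Key_read%';
-- Key_reads / Key_read_requests approximates the index-cache miss rate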

Karoly Horvath answered Oct 08 '22 13:10