So I have a VPS with 512 MB of RAM, and a MySQL table like this:
CREATE TABLE `table1` (
  `id` int(20) unsigned NOT NULL auto_increment,
  `ts` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `value1` char(31) collate utf8_unicode_ci default NULL,
  `value2` varchar(100) collate utf8_unicode_ci default NULL,
  `value3` varchar(100) collate utf8_unicode_ci default NULL,
  `value4` mediumtext collate utf8_unicode_ci,
  `type` varchar(30) collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `type` (`type`),
  KEY `date` (`ts`)
) ENGINE=MyISAM AUTO_INCREMENT=469692 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
If I execute a query like this, it takes 2~18 seconds to complete:
SELECT `id`, `ts`, `value1`, `value2`, `value3` FROM table1 WHERE
`type` = 'something' ORDER BY `id` DESC limit 0,10;
EXPLAIN SELECT tells me:
select_type: SIMPLE
type: ref
possible_keys: type
key: type
key_len: 92
ref: const
rows: 7291
Extra: Using where; Using filesort
I thought the 'Using filesort' might be the problem, but it turns out that's not the case: if I remove the ORDER BY and the LIMIT, the query takes just as long. (I turn off the query cache for testing with SET @@query_cache_type=0;.)
mysql> EXPLAIN SELECT `id`,`ts`,`value1`,`value2`, `value3`
FROM table1 WHERE `type` = 'something'\G
select_type: SIMPLE
type: ref
possible_keys: type
key: type
key_len: 92
ref: const
rows: 7291
Extra: Using where
I don't know if it matters, but the row estimate is inaccurate:
SELECT COUNT(*) FROM table1 WHERE `type` = 'something';
returns 22.8k rows.
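(I assume the estimate comes from stale index statistics; if I understand correctly, something like the following should refresh them, though I haven't checked whether it actually changes the estimate:

-- Refresh the MyISAM key distribution statistics, which EXPLAIN's
-- row estimate is derived from
ANALYZE TABLE table1;

)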
The query already seems optimized, and I don't know how I could improve it further. The whole table contains 370k rows and is about 4.6 GiB in size. Could it be that, because the type values are randomly distributed across the whole table, it takes 2~18 seconds just to fetch the data from disk?
The funny thing is that queries for a type with only a few hundred rows are slow too: MySQL returns them at roughly 100 rows/sec!
|-------+----------+----------|
| count | time (s) | rows/sec |
|-------+----------+----------|
| 22802 |     18.7 |   1219.4 |
|    11 |      0.1 |    110.0 |
|   491 |      4.8 |    102.3 |
|   705 |      5.6 |    125.9 |
|   317 |      2.6 |    121.9 |
|-------+----------+----------|
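(Doing the math on the small types: 100 rows/sec works out to about 10 ms per row, which is suspiciously close to a single disk seek time, so it looks like every row costs roughly one seek.)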
Why is it so slow? Can I further optimize the query? Should I move the data to smaller tables?
I thought automatic partitioning would be a good idea, creating a new partition for every type dynamically. That is not possible, for several reasons, including that the maximum number of partitions is 1024 while the number of types is unbounded. I could also try application-level partitioning, creating a new table for every new type, but I'd rather not: it introduces a lot of complexity, and I don't know how I could keep a single unique id across all the tables (though I've seen a sequence-table idiom suggested for that; see the sketch below). Also, if I reach multiple inserts per second, performance would drop significantly.
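For reference, the sequence-table idiom I've seen suggested (hypothetical here, I haven't tried it) is a one-row table plus LAST_INSERT_ID(expr):

-- One-row table holding the last id handed out (names are made up)
CREATE TABLE sequence (id INT UNSIGNED NOT NULL) ENGINE=MyISAM;
INSERT INTO sequence VALUES (0);

-- Atomically claim the next id; read it back on the same connection
UPDATE sequence SET id = LAST_INSERT_ID(id + 1);
SELECT LAST_INSERT_ID();

But that's an extra round trip per insert, which doesn't help my inserts-per-second worry.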
Thanks in advance.
You need a multi-column index for that query:
KEY `typeid` (`type`, `id`)
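For instance, it can be added with:

-- `type` serves the WHERE as an equality, and `id` lets MySQL read the
-- ten newest matching rows in index order, skipping the filesort
ALTER TABLE table1 ADD KEY `typeid` (`type`, `id`);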
Unfortunately, as you stated, it is also slow without the ORDER BY, so it's slow because the records are scattered around on the disk and MySQL has to do a lot of seeks. Once cached, it should be quite fast. (Note: 22.8k/370k × 4.6 GiB ≈ 283 MB, so if you run other activity/queries, those records won't stay in memory for long, or might not even fit in your 512 MB.)
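If the seeks themselves are the killer, one more aggressive option (my speculation, untested) is a covering index that includes every selected column, so MyISAM can answer from the index file alone and never touch the scattered data rows; with utf8 at 3 bytes per character, these columns should just fit under MyISAM's 1000-byte key limit:

-- Hypothetical covering index: EXPLAIN should then show "Using index",
-- meaning no seeks into the data file at all
ALTER TABLE table1
  ADD KEY `type_covering` (`type`, `id`, `ts`, `value1`, `value2`, `value3`);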
Run iostat 1 to verify the I/O bottleneck. Loads of RAM could solve your problem; an SSD could also solve it. But RAM is cheaper ;)
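A slightly more detailed invocation, assuming the sysstat package is installed:

# Extended per-device stats, refreshed every second; %util near 100
# together with high await means the disk is saturated by seeks
iostat -x 1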