
Slow MySQL SELECT on large table

Tags: sql, mysql

I have a table storing the prices of ~35k items, sampled every 15 minutes and kept for 2 weeks. That works out to roughly 35 million rows. I'm trying to perform the simplest of queries:

SELECT buy_price, sell_price, created_at FROM price_archive WHERE item_id = X

The first, uncached run of that query takes around 4-7 seconds to return ~1300 rows (one item's history). This seems ridiculously slow for something so trivial for a database, especially considering there is an index on the item_id column.

The table has 35k rows inserted every 15 minutes, and each day a task runs to delete rows older than roughly two weeks (to stop the table from growing too much). I suspect this fragments the table quite a bit, but would that fragmentation cause the query to perform this poorly? If so, would partitioning on created_at, so that old data can be dropped by partition, make it better?

1306 rows in set (8.32 sec)

mysql> explain select * from price_archives where item_id = 743;
+----+-------------+----------------+------+---------------------------------+---------------------------------+---------+-------+------+-------------+
| id | select_type | table          | type | possible_keys                   | key                             | key_len | ref   | rows | Extra       |
+----+-------------+----------------+------+---------------------------------+---------------------------------+---------+-------+------+-------------+
|  1 | SIMPLE      | price_archives | ref  | index_price_archives_on_item_id | index_price_archives_on_item_id | 5       | const | 1305 | Using where |
+----+-------------+----------------+------+---------------------------------+---------------------------------+---------+-------+------+-------------+
Asked Oct 01 '22 by Supremacy


1 Answer

This is your query:

SELECT buy_price, sell_price, created_at
FROM price_archive
WHERE item_id = X;

The best index for this query is the composite index price_archive(item_id, buy_price, sell_price, created_at). This is a "covering" index: the query can be satisfied entirely from the index, without touching the table rows at all. It comes with a downside, though: the index may slow the inserts into the table. 140k rows every hour is a lot of data, but maintaining this index should not be too bad.
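As a sketch, the covering index described above could be created like this (the index name is illustrative; table and column names are taken from the question):

```sql
-- Covering index: item_id first for the WHERE clause, then the
-- selected columns so the query never touches the base table.
CREATE INDEX idx_price_archive_item_covering
    ON price_archive (item_id, buy_price, sell_price, created_at);
```

With this index in place, EXPLAIN should show "Using index" in the Extra column instead of "Using where", indicating an index-only read.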

You are facing a common challenge in databases. The problem with your query is that the ~1300 rows being returned are scattered across different data pages. In all likelihood, the table does not fit into memory on your machine, so fetching them results in about 1300 random reads from disk. That explains why you are seeing a lag of several seconds.

Another solution is to ensure that the data tables themselves fit into memory. Although the first, uncached query will still take a bit of time, subsequent queries should be quite fast.
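The question also asks about partitioning for the daily cleanup. A sketch of that scheme, assuming MySQL range partitioning and the column names from the question (the partition names and dates are placeholders):

```sql
-- Hypothetical sketch: range-partition by day so old data can be
-- removed with DROP PARTITION instead of a large, fragmenting DELETE.
-- Caveat: MySQL requires every unique key (including the primary key)
-- to include the partitioning column.
ALTER TABLE price_archive
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p20221001 VALUES LESS THAN (TO_DAYS('2022-10-02')),
    PARTITION p20221002 VALUES LESS THAN (TO_DAYS('2022-10-03')),
    PARTITION pmax      VALUES LESS THAN MAXVALUE
);

-- The daily cleanup then becomes a near-instant metadata operation:
ALTER TABLE price_archive DROP PARTITION p20221001;
```

Dropping a partition avoids the row-by-row deletes that fragment the table, though it does not by itself speed up the SELECT, since that query filters on item_id rather than created_at.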

Answered Oct 11 '22 by Gordon Linoff