Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select top n rows efficiently

Tags:

mysql

So I have a table, possibly millions of rows long,

user   | points
---------------
user1  | 10
user2  | 12
user3  | 7
...

and want to SELECT * FROM mytable ORDER BY points LIMIT 100, 1000 Now that works fine, but is horribly slow (on huge tables), since it refuses to use any kind of index, but performs a full table scan. How can I make this more efficient?

My first (obvious) idea was to use an index on points DESC, but then I figured out that MySQL does not support those at all.

Next, I tried to reverse the sign on points, meaning essentially having an ascending index on -points, this didnt help either, since it doesnt use the index for sorting

Lastly, I tried using force index, this yielded barely any performance improvement, since it still fetches the entire table, yet doesnt sort (using filesort: false in EXPLAIN)

I am sure this must be a solved problem, but I did not find any helpful information online. Any hints would be greatly appreciated.

like image 537
CBenni Avatar asked Feb 18 '26 19:02

CBenni


1 Answers

Some ways to get better performance from a query.

Never never use SELECT *. It's a rookie mistake. It basically tells the query planner it needs to give you everything. Always enumerate the columns you want in the result set. This is the query you want (assuming you haven't oversimplified your question).

 SELECT user, points
   FROM table
  ORDER BY points
  LIMIT 100,1000

Use a compound index. In the case of your query, a compound index on (points, user) will allow the use of a partial index scan to satisfy your query. That should be faster than a full table sort. MySQL can scan indexes backward or forward, so you don't need to worry about descending order

To add the correct index use a command like this.

ALTER TABLE table ADD INDEX points_user (points, user);

Edit. The suggestion against using SELECT * here is based on (1) my unconfirmed suspicion that the table in question is oversimplified and has other columns in real life, and (2) the inconvenient reality that sometimes the index has to match the query precisely to get best performance results.

I stand by my opinion, based on experience, that using SELECT * in queries with performance sensitivity is not good engineering practice (unless you like the query so much you want to come back to it again and again).

like image 188
O. Jones Avatar answered Feb 21 '26 12:02

O. Jones