
GROUP BY query optimization

Database is MySQL with MyISAM engine.

Table definition:

CREATE TABLE IF NOT EXISTS matches (
  id int(11) NOT NULL AUTO_INCREMENT,
  game int(11) NOT NULL,
  user int(11) NOT NULL,
  opponent int(11) NOT NULL,
  tournament int(11) NOT NULL,
  score int(11) NOT NULL,
  finish tinyint(4) NOT NULL,
  PRIMARY KEY (id),
  KEY game (game),
  KEY user (user),
  KEY i_gfu (game, finish, user)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3149047;

I have set an index on (game, finish, user) but this GROUP BY query still needs 0.4 - 0.6 seconds to run:

SELECT user AS player
     , COUNT( id ) AS times
FROM matches
WHERE finish = 1
  AND game = 19
GROUP BY user
ORDER BY times DESC

The EXPLAIN output:

| id | select_type | table   | type | possible_keys | key   | key_len | ref         | rows   | Extra                                        |
|  1 | SIMPLE      | matches | ref  | game,i_gfu    | i_gfu | 5       | const,const | 155855 | Using where; Using temporary; Using filesort |

Is there any way I can make it faster? The table has about 800K records.


EDIT: I changed COUNT(id) to COUNT(*) and the time dropped to 0.08 - 0.12 seconds. I think I had tried that before creating the index and forgot to change it back afterwards.

In the EXPLAIN output, the added Using index explains the speedup:

|   rows |   Extra                                                   |
| 168029 | Using where; Using index; Using temporary; Using filesort |

(Side question: is a speedup by a factor of 5 normal?)

There are about 2000 users, so the final sort does not hurt performance, even if it uses filesort. I tried without ORDER BY and it still takes almost the same time.

ypercubeᵀᴹ asked May 20 '11 12:05


3 Answers

Get rid of the 'game' key: it is redundant with 'i_gfu', which already has game as its leftmost column. Because 'id' is declared NOT NULL, COUNT(id) just returns the number of rows in each group, so you can replace it with COUNT(*). Try it that way and paste the output of EXPLAIN:

SELECT user AS player, COUNT(*) AS times
FROM matches
WHERE finish = 1
AND game = 19
GROUP BY user
ORDER BY times DESC
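
Dropping the redundant index and re-checking the plan could look like the following sketch. It assumes nothing else depends on the 'game' key (for example an index hint in another query), so check that before dropping it.

```sql
-- i_gfu (game, finish, user) already serves lookups on game alone,
-- so the single-column key adds write cost without helping reads.
ALTER TABLE matches DROP INDEX game;

-- Re-run the plan; look for "Using index" in the Extra column,
-- which means the query is answered from i_gfu without reading table rows.
EXPLAIN
SELECT user AS player, COUNT(*) AS times
FROM matches
WHERE finish = 1
  AND game = 19
GROUP BY user
ORDER BY times DESC;
```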
matt answered Sep 19 '22 10:09


One of the shortcomings of this query is that you order by an aggregate. That means that you can't return any rows until the full result set has been generated; no index can exist (for mysql myisam, anyway) to fix that.

You can denormalize your data fairly easily to overcome this, though. You could, for instance, add insert/update triggers that maintain a count value in an indexed summary table, so that you can start returning rows immediately.
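
A minimal sketch of that idea follows. The table name match_counts, the index name i_game_times, and the trigger name matches_after_insert are made up for illustration. It only handles inserts: rows whose finish flag changes later, or rows that get deleted, would need matching UPDATE and DELETE triggers. DELIMITER is a command of the mysql command-line client, not part of SQL itself.

```sql
-- Hypothetical summary table: one row per (game, user) with the finished-match count.
CREATE TABLE match_counts (
  game  INT NOT NULL,
  user  INT NOT NULL,
  times INT NOT NULL DEFAULT 0,
  PRIMARY KEY (game, user),
  KEY i_game_times (game, times)   -- lets ORDER BY times be read from the index
) ENGINE=MyISAM;

-- One-time backfill from the existing data.
INSERT INTO match_counts (game, user, times)
SELECT game, user, COUNT(*)
FROM matches
WHERE finish = 1
GROUP BY game, user;

-- Keep the counts current for newly inserted finished matches.
DELIMITER //
CREATE TRIGGER matches_after_insert
AFTER INSERT ON matches
FOR EACH ROW
BEGIN
  IF NEW.finish = 1 THEN
    INSERT INTO match_counts (game, user, times)
    VALUES (NEW.game, NEW.user, 1)
    ON DUPLICATE KEY UPDATE times = times + 1;
  END IF;
END//
DELIMITER ;

-- The original report then becomes a plain indexed read.
SELECT user AS player, times
FROM match_counts
WHERE game = 19
ORDER BY times DESC;
```

The trade-off is extra work on every insert into matches in exchange for a read that no longer aggregates roughly 160K index entries per request.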

SingleNegationElimination answered Sep 19 '22 10:09


Eh, tough. Try reordering your index: put the user column first (so make the index (user, finish, game)), as that increases the chance the GROUP BY can use the index. However, in general GROUP BY can only use a loose index scan if the only aggregate functions used are MIN and MAX (see http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html and http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html). Your ORDER BY isn't really helping either.
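
One way to try the reordered index is sketched below. The index name i_ufg is made up for illustration, and whether the optimizer picks it over i_gfu is not guaranteed, which is why the second statement forces it so the two plans can be compared.

```sql
-- Hypothetical index with user as the leading column, as suggested above.
ALTER TABLE matches ADD INDEX i_ufg (user, finish, game);

-- Plan with the optimizer's own index choice.
EXPLAIN
SELECT user AS player, COUNT(*) AS times
FROM matches
WHERE finish = 1
  AND game = 19
GROUP BY user
ORDER BY times DESC;

-- Plan with the new index forced, for comparison.
EXPLAIN
SELECT user AS player, COUNT(*) AS times
FROM matches FORCE INDEX (i_ufg)
WHERE finish = 1
  AND game = 19
GROUP BY user
ORDER BY times DESC;
```

Compare the rows and Extra columns of the two plans, then time both queries before deciding which index to keep.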

Femi answered Sep 18 '22 10:09