Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to improve performance of COUNT(DISTINCT field1)... GROUP BY field2?

I have the following query

EXPLAIN SELECT COUNT(DISTINCT ip_address) as ip_address, exec_date
    FROM requests
    GROUP BY exec_date;

id  select_type table       type        possible_keys   key         key_len ref      rows   Extra
1   SIMPLE      requests    range       NULL            daily_ips   263     NULL    488213  Using index for group-by (scanning)

With a covering index daily_ips

Table       Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
requests    1           daily_ips   1               exec_date   A           16          NULL        NULL    YES BTREE       
requests    1           daily_ips   2               ip_address  A           483492      NULL        NULL    YES BTREE       

Is there any way I can further optimize this query?

What exactly does Using index for group-by (scanning) mean? Does it mean that the entire GROUP BY clause is done entirely from an index while the COUNT(DISTINCT ip_address) part of the statement is not?

like image 960
user784637 Avatar asked Nov 27 '12 01:11

user784637


1 Answers

Based on the data you've provided, I don't see any way you can further optimize the query.

As to your follow-up question, MySQL's manual page describing explain output for Using index for group-by says:

Similar to the Using index table access method, Using index for group-by indicates that MySQL found an index that can be used to retrieve all columns of a GROUP BY or DISTINCT query without any extra disk access to the actual table. Additionally, the index is used in the most efficient way so that for each group, only a few index entries are read. For details, see Section 8.13.10, “GROUP BY Optimization”.

Your index is particularly well-suited to speeding up your query. Because only indexed fields are being selected (each column in your query also appears in the index), MySQL may not even need to hit the table at all, as all the relevant data appears in the index.

If executing a query were like performing a search on google, imagine not having to click through to any of the linked sites, because you found the information you were looking for directly in the search results - that's sort of like what not needing to scan the table data is like. Here is some more information on how MySQL uses indexes:

In some cases, a query can be optimized to retrieve values without consulting the data rows. (An index that provides all the necessary results for a query is called a covering index.) If a query uses only columns from a table that are numeric and that form a leftmost prefix for some key, the selected values can be retrieved from the index tree for greater speed:

SELECT key_part3 FROM tbl_name WHERE key_part1=1

like image 193
Daniel Miladinov Avatar answered Sep 30 '22 17:09

Daniel Miladinov