Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how does MySQL implement the "group by"?

Tags:

mysql

I read from the MySQL Reference Manual and find that when it can take use of index,it just do index scan,other it will create tmp tables and do things like filesort. And I also read from other article that the "Group By" result will sort by group by columns by default,if "order by null" clause added,it won't don filesort. The difference can be found from the "explain ..." clause. so my problem is:what is the difference between "group by" clause that with "order by null" and which doesn't have? I try to use profiling to see what mysql do on the background,and only see result like:

result for group clause without order by null:
|preparing                      | 0.000016 | 
| Creating tmp table             | 0.000048 | 
| executing                      | 0.000009 | 
| Copying to tmp table           | 0.000109 | 
**| Sorting result                 | 0.000023 |** 
| Sending data                   | 0.000027 | 

result for clause with "order by null":
preparing                      | 0.000016 | 
| Creating tmp table             | 0.000052 | 
| executing                      | 0.000009 | 
| Copying to tmp table           | 0.000114 | 
| Sending data                   | 0.000028 | 

So I guess what MySQL do when the "order by null" added,it does not use filesort algorithm,maybe when it creates the tmp table,it uses index as well,and then use the index to do group by operation,when completed,it just read result from the table rows and does not sort the result.

But my original opinion is that MySQL can use quicksort to sort the items and then do group by,so the result will be sorted as well.

Any opinion appreciated,thanks.

like image 990
user188916 Avatar asked Mar 17 '10 02:03

user188916


1 Answers

mysql> select max(post_date),post_author from wp_posts
-> where id > 10 and id < 1000
-> group by post_author;
+———————+————-+
| max(post_date) | post_author |
+———————+————-+
| 2009-07-03 12:58:39 | 1 |
+———————+————-+
1 row in set (0.01 sec)

mysql> show profiles;
+———-+————+————————+
| Query_ID | Duration | Query |
+———-+————+————————+
| 1 | 0.00013200 | SELECT DATABASE() |
| 2 | 0.00030900 | show databases |
| 3 | 0.00030400 | show tables |
| 4 | 0.01180000 | select max(post_date),post_author from wp_posts where id > 10 and id < 1000 group by post_author |4 rows in set (0.00 sec)

mysql> show profile cpu,block io for query 4;
+———————-+———-+———-+————+————–+—————+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+———————-+———-+———-+————+————–+—————+
| starting | 0.000085 | 0.000000 | 0.000000 | 0 | 0 |
| Opening tables | 0.000010 | 0.000000 | 0.000000 | 0 | 0 |
| System lock | 0.000005 | 0.000000 | 0.000000 | 0 | 0 |
| Table lock | 0.000008 | 0.000000 | 0.000000 | 0 | 0 |
| init | 0.000029 | 0.000000 | 0.000000 | 0 | 0 |
| optimizing | 0.000014 | 0.000000 | 0.000000 | 0 | 0 |
| statistics | 0.000062 | 0.000000 | 0.000000 | 0 | 0 |
| preparing | 0.000016 | 0.000000 | 0.000000 | 0 | 0 |
| Creating tmp table | 0.000035 | 0.000000 | 0.000000 | 0 | 0 |
| executing | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| Copying to tmp table | 0.011386 | 0.004999 | 0.006999 | 0 | 0 |
| Sorting result | 0.000044 | 0.000000 | 0.000000 | 0 | 0 |
| Sending data | 0.000036 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| removing tmp table | 0.000012 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| query end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| freeing items | 0.000013 | 0.000000 | 0.000000 | 0 | 0 |
| closing tables | 0.000018 | 0.000000 | 0.000000 | 0 | 0 |
| logging slow query | 0.000003 | 0.000000 | 0.000000 | 0 | 0 |
| cleaning up | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
+———————-+———-+———-+————+————–+—————+
22 rows in set (0.00 sec)

mysql>
mysql>
mysql> select max(post_date),post_author from wp_posts
-> where id > 10 and id < 1000
-> group by post_author order by null;
+———————+————-+
| max(post_date) | post_author |
+———————+————-+
| 2009-07-03 12:58:39 | 1 |
+———————+————-+
1 row in set (0.01 sec)

mysql> show profiles;
+———-+————+—————–+
| Query_ID | Duration | Query
+———-+————+—————–+
|1 | 0.00013200 | SELECT DATABASE()
|2 | 0.00030900 | show databases
|3 | 0.00030400 | show tables
|4 | 0.01180000 | select max(post_date),post_author from wp_posts where id > 10 and id < 1000 group by post_author
|5 | 0.01177700 | select max(post_date),post_author from wp_posts where id > 10 and id < 1000 group by post_author order by null
5 rows in set (0.00 sec)
mysql> show profile cpu,block io for query 5;
+———————-+———-+———-+————+————–+—————+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+———————-+———-+———-+————+————–+—————+
| starting | 0.000097 | 0.000000 | 0.000000 | 0 | 0 |
| Opening tables | 0.000013 | 0.000000 | 0.000000 | 0 | 0 |
| System lock | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |
| Table lock | 0.000008 | 0.000000 | 0.000000 | 0 | 0 |
| init | 0.000032 | 0.000000 | 0.000000 | 0 | 0 |
| optimizing | 0.000012 | 0.000000 | 0.000000 | 0 | 0 |
| statistics | 0.000065 | 0.000000 | 0.000000 | 0 | 0 |
| preparing | 0.000017 | 0.000000 | 0.000000 | 0 | 0 |
| Creating tmp table | 0.000040 | 0.000000 | 0.000000 | 0 | 0 |
| executing | 0.000003 | 0.000000 | 0.000000 | 0 | 0 |
| Copying to tmp table | 0.011369 | 0.005999 | 0.004999 | 0 | 0 |
| Sending data | 0.000040 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| removing tmp table | 0.000031 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000005 | 0.000000 | 0.000000 | 0 | 0 |
| end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| query end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |
| freeing items | 0.000012 | 0.000000 | 0.000000 | 0 | 0 |
| closing tables | 0.000009 | 0.000000 | 0.000000 | 0 | 0 |
| logging slow query | 0.000003 | 0.000000 | 0.000000 | 0 | 0 |
| cleaning up | 0.000003 | 0.000000 | 0.000000 | 0 | 0 |
+———————-+———-+———-+————+————–+—————+
21 rows in set (0.00 sec)

From here we can see that The second part does't have the "Sorting result" step,so a little impact on performance.

like image 134
oyishi Avatar answered Nov 14 '22 16:11

oyishi