Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL: How does groupby work on columns without aggregate functions?

Tags:

mysql

group-by

I am somewhat confused about how the group by command works in mysql.

Suppose I have a table:

mysql> select recordID, IPAddress, date, httpMethod from Log_Analysis_Records_dalhousieShort;                   
+----------+-----------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | date                | httpMethod                                      |
+----------+-----------------+---------------------+-------------------------------------------------+
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        4 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /include/fcs_style.css HTTP/1.1             | 
|        5 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /include/main_page.css HTTP/1.1             | 
|        6 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /images/bigportaltopbanner.gif HTTP/1.1     | 
|        7 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /images/right_1.jpg HTTP/1.1                | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        9 | 64.68.88.165    | 2003-07-09 00:02:44 | GET /news/sports/basketball.shtml HTTP/1.0      | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | GET / HTTP/1.1                                  | 
|       12 | 129.173.159.98  | 2003-07-09 00:03:46 | GET /include/fcs_style.css HTTP/1.1             | 
|       13 | 129.173.159.98  | 2003-07-09 00:03:46 | GET /include/main_page.css HTTP/1.1             | 
|       14 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/bigportaltopbanner.gif HTTP/1.1     | 
|       15 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/left_1g.jpg HTTP/1.1                | 
|       16 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/webcam.gif HTTP/1.1                 | 
+----------+-----------------+---------------------+-------------------------------------------------+

When I am execute this statement how does it choose which recordID to include since there are a range of recordIDs that would be correct? Does it just choose the first one that matches?

mysql> select recordID, IPAddress, date, httpMethod from Log_Analysis_Records_dalhousieShort GROUP BY IPADDRESS;
+----------+-----------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | date                | httpMethod                                      |
+----------+-----------------+---------------------+-------------------------------------------------+
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | GET / HTTP/1.1                                  | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
+----------+-----------------+---------------------+-------------------------------------------------+
6 rows in set (0.00 sec)

For this table the max(date) and min(date) values seem logical to me but I am confused about how the recordID and httpMethod where chosen.

Is it safe use two aggregate functions in one command?

mysql> select recordID, IPAddress, min(date), max(date), httpMethod from Log_Analysis_Records_dalhousieShort GROUP BY IPADDRESS;
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | min(date)           | max(date)           | httpMethod                                      |
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | 2003-07-09 00:03:48 | GET / HTTP/1.1                                  | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | 2003-07-09 00:02:44 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
6 rows in set (0.00 sec)
like image 300
sixtyfootersdude Avatar asked Nov 14 '10 17:11

sixtyfootersdude


People also ask

Can GROUP BY be used without aggregate functions?

You can use the GROUP BY clause without applying an aggregate function. The following query gets data from the payment table and groups the result by customer id. In this case, the GROUP BY works like the DISTINCT clause that removes duplicate rows from the result set.

Is aggregate function necessary with GROUP BY?

If you don't specify GROUP BY , aggregate functions operate over all the records selected. In that case, it doesn't make sense to also select a specific column like EmployeeID .

Is aggregate function mandatory for GROUP BY clause?

Expressions that are not encapsulated within an aggregate function and must be included in the GROUP BY Clause at the end of the SQL statement. This is an aggregate function such as the SUM, COUNT, MIN, MAX, or AVG functions. This is the column or expression that the aggregate_function will be used on.

What is non aggregated column MySQL?

If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.


1 Answers

Usually use of GROUP BY while listing a field in the select expression without an aggregate function is invalid SQL and should throw an error.

MySQL, however, allows this and simply chooses one value randomly. Try to avoid it, because it is confusing.

To disallow this, you can say at runtime:

SET sql_mode := CONCAT('ONLY_FULL_GROUP_BY,',@@sql_mode);

or use the configuration value and/or command line option sql-mode.

Yes, listing two aggregate functions is completely valid.

like image 95
AndreKR Avatar answered Sep 19 '22 03:09

AndreKR