MySQL: How does GROUP BY work on columns without aggregate functions?

Tags: mysql, group-by

I am somewhat confused about how the GROUP BY clause works in MySQL.

Suppose I have a table:

mysql> select recordID, IPAddress, date, httpMethod from Log_Analysis_Records_dalhousieShort;                   
+----------+-----------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | date                | httpMethod                                      |
+----------+-----------------+---------------------+-------------------------------------------------+
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        4 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /include/fcs_style.css HTTP/1.1             | 
|        5 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /include/main_page.css HTTP/1.1             | 
|        6 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /images/bigportaltopbanner.gif HTTP/1.1     | 
|        7 | 129.173.177.214 | 2003-07-09 00:01:23 | GET /images/right_1.jpg HTTP/1.1                | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        9 | 64.68.88.165    | 2003-07-09 00:02:44 | GET /news/sports/basketball.shtml HTTP/1.0      | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | GET / HTTP/1.1                                  | 
|       12 | 129.173.159.98  | 2003-07-09 00:03:46 | GET /include/fcs_style.css HTTP/1.1             | 
|       13 | 129.173.159.98  | 2003-07-09 00:03:46 | GET /include/main_page.css HTTP/1.1             | 
|       14 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/bigportaltopbanner.gif HTTP/1.1     | 
|       15 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/left_1g.jpg HTTP/1.1                | 
|       16 | 129.173.159.98  | 2003-07-09 00:03:48 | GET /images/webcam.gif HTTP/1.1                 | 
+----------+-----------------+---------------------+-------------------------------------------------+

When I execute this statement, how does it choose which recordID to include, since there is a range of recordIDs that would be correct? Does it just choose the first one that matches?

mysql> select recordID, IPAddress, date, httpMethod from Log_Analysis_Records_dalhousieShort GROUP BY IPADDRESS;
+----------+-----------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | date                | httpMethod                                      |
+----------+-----------------+---------------------+-------------------------------------------------+
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | GET / HTTP/1.1                                  | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
+----------+-----------------+---------------------+-------------------------------------------------+
6 rows in set (0.00 sec)

In the result below, the max(date) and min(date) values seem logical to me, but I am confused about how the recordID and httpMethod values were chosen.

Is it safe to use two aggregate functions in one command?

mysql> select recordID, IPAddress, min(date), max(date), httpMethod from Log_Analysis_Records_dalhousieShort GROUP BY IPADDRESS;
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
| recordID | IPAddress       | min(date)           | max(date)           | httpMethod                                      |
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
|       11 | 129.173.159.98  | 2003-07-09 00:03:46 | 2003-07-09 00:03:48 | GET / HTTP/1.1                                  | 
|        3 | 129.173.177.214 | 2003-07-09 00:01:23 | 2003-07-09 00:01:23 | GET / HTTP/1.1                                  | 
|        8 | 64.68.88.165    | 2003-07-09 00:02:43 | 2003-07-09 00:02:44 | GET /studentservices/responsible.shtml HTTP/1.0 | 
|        2 | 64.68.88.166    | 2003-07-09 00:00:55 | 2003-07-09 00:00:55 | GET /news/internet/xml.shtml HTTP/1.0           | 
|        1 | 64.68.88.22     | 2003-07-09 00:00:21 | 2003-07-09 00:00:21 | GET /news/science/cancer.shtml HTTP/1.0         | 
|       10 | 64.68.88.34     | 2003-07-09 00:02:46 | 2003-07-09 00:02:46 | GET /news/science/space.shtml HTTP/1.0          | 
+----------+-----------------+---------------------+---------------------+-------------------------------------------------+
6 rows in set (0.00 sec)


asked Nov 14 '10 17:11

sixtyfootersdude

1 Answer

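In standard SQL, selecting a column that is neither listed in the GROUP BY clause nor wrapped in an aggregate function is generally invalid and should raise an error.

MySQL, however, allows this unless the ONLY_FULL_GROUP_BY SQL mode is enabled (it is part of the default mode since MySQL 5.7.5). In that case it returns the value from an arbitrary row of each group, and nothing guarantees that it is the first matching row. Try to avoid it, because the result is confusing and nondeterministic.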
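One way to get a predictable row per IP address is to state explicitly which row is wanted. The following is a minimal sketch, assuming that "the first one" means the row with the smallest recordID in each group and that recordID is unique; the aliases r, f and firstRecordID are made up for illustration.

```sql
-- Pick, for each IPAddress, the row with the smallest recordID.
-- Assumption: recordID is unique, so the join returns one row per group.
SELECT r.recordID, r.IPAddress, r.date, r.httpMethod
FROM Log_Analysis_Records_dalhousieShort AS r
JOIN (
    -- One row per IP address: the lowest recordID in that group.
    SELECT IPAddress, MIN(recordID) AS firstRecordID
    FROM Log_Analysis_Records_dalhousieShort
    GROUP BY IPAddress
) AS f
  ON f.firstRecordID = r.recordID
ORDER BY r.IPAddress;
```

In the derived table every selected column is either grouped or aggregated, and the outer query has no GROUP BY at all, so the result does not depend on which row the server happens to pick.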

To make MySQL reject such queries, you can run the following at runtime:

SET sql_mode := CONCAT('ONLY_FULL_GROUP_BY,',@@sql_mode);

or set the sql-mode configuration value or command-line option.
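To see what this changes in practice, here is a rough sketch of checking the active mode and of writing the query from the question so that it is still accepted. It assumes MySQL 5.7.5 or later, where the ANY_VALUE() function is available; the aliases anyRecordID, anyDate and anyHttpMethod are made up for illustration.

```sql
-- Show the SQL mode in effect for the current session.
SELECT @@SESSION.sql_mode;

-- With ONLY_FULL_GROUP_BY enabled, the query from the question is rejected
-- (typically with error 1055), because recordID, date and httpMethod are
-- neither grouped nor aggregated.

-- ANY_VALUE() states explicitly that a value from any row of the group
-- is acceptable, so this form is accepted even in strict grouping mode.
SELECT ANY_VALUE(recordID)   AS anyRecordID,
       IPAddress,
       ANY_VALUE(date)       AS anyDate,
       ANY_VALUE(httpMethod) AS anyHttpMethod
FROM Log_Analysis_Records_dalhousieShort
GROUP BY IPAddress;
```

The values are still arbitrary, and the three ANY_VALUE() columns should not be assumed to come from the same row; the function only documents that the arbitrariness is intended.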

Yes, listing two aggregate functions is completely valid.
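As an illustration, a version of the query in which every selected column is either the grouping column or an aggregate could look like the sketch below. The column aliases firstRequest, lastRequest and requestCount are made up for the example, and COUNT(*) is added only to show that more than two aggregates can be combined.

```sql
-- Each aggregate is computed independently over the rows of one group,
-- so any number of them can appear in the same SELECT list.
SELECT IPAddress,
       MIN(date) AS firstRequest,
       MAX(date) AS lastRequest,
       COUNT(*)  AS requestCount
FROM Log_Analysis_Records_dalhousieShort
GROUP BY IPAddress;
```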

answered Sep 19 '22 03:09

AndreKR
