MySQL "Group By" and "Order By"

Tags:

sql

mysql

sql-order-by

group-by

aggregate-functions

I want to be able to select a bunch of rows from a table of e-mails and group them by the from sender. My query looks like this:

SELECT 
    `timestamp`, `fromEmail`, `subject`
FROM `incomingEmails` 
GROUP BY LOWER(`fromEmail`) 
ORDER BY `timestamp` DESC

The query almost works as I want it — it selects records grouped by e-mail. The problem is that the subject and timestamp don't correspond to the most recent record for a particular e-mail address.

For example, it might return:

fromEmail: john@example.com, subject: hello
fromEmail: mark@example.com, subject: welcome

When the records in the database are:

fromEmail: john@example.com, subject: hello
fromEmail: john@example.com, subject: programming question
fromEmail: mark@example.com, subject: welcome

If the "programming question" subject is the most recent, how can I get MySQL to select that record when grouping the e-mails?

asked Jun 30 '09 22:06

John Kurlak

6 Answers

A simple solution is to wrap the query in a subselect, applying the ORDER BY first and the GROUP BY afterwards:

SELECT * FROM ( 
    SELECT `timestamp`, `fromEmail`, `subject`
    FROM `incomingEmails` 
    ORDER BY `timestamp` DESC
) AS tmp_table GROUP BY LOWER(`fromEmail`)

This is similar to using the join but looks much nicer.

Using non-aggregate columns in a SELECT with a GROUP BY clause is non-standard. MySQL will generally return the values of the first row it finds and discard the rest. Any ORDER BY clauses will only apply to the returned column value, not to the discarded ones.

IMPORTANT UPDATE: Selecting non-aggregate columns used to work in practice but should not be relied upon. Per the MySQL documentation, "this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."

As of MySQL 5.7.5, ONLY_FULL_GROUP_BY is enabled by default, so non-aggregate columns cause query errors (ER_WRONG_FIELD_WITH_GROUP).

As @mikep points out below, the solution is to use ANY_VALUE(), available from 5.7 onward.
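As a minimal sketch of what ANY_VALUE() does here, the query below should run with ONLY_FULL_GROUP_BY enabled on MySQL 5.7.5 or later. The aliases latestTimestamp and someSubject are made up for illustration. Note that ANY_VALUE() only suppresses the error: the subject it returns is still arbitrary and is not guaranteed to come from the newest row.

```sql
-- Assumes the incomingEmails table from the question.
SELECT
    fromEmail,
    MAX(`timestamp`)   AS latestTimestamp,  -- deterministic: newest timestamp per sender
    ANY_VALUE(subject) AS someSubject       -- arbitrary: may come from any row in the group
FROM incomingEmails
GROUP BY fromEmail;
```

If the subject has to match the newest e-mail, one of the join-based queries in the other answers is the safer choice.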

See:

http://www.cafewebmaster.com/mysql-order-sort-group
https://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
https://dev.mysql.com/doc/refman/5.7/en/miscellaneous-functions.html#function_any-value

answered Oct 01 '22 17:10

b7kich

As already pointed out in a reply, the current answer is wrong, because the GROUP BY arbitrarily selects a record from each group.

If one is using MySQL 5.6, or MySQL 5.7 with ONLY_FULL_GROUP_BY, the correct (deterministic) query is:

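SELECT incomingEmails.*
  FROM (
    -- Latest timestamp per sender
    SELECT fromEmail, MAX(timestamp) `timestamp`
    FROM incomingEmails
    GROUP BY fromEmail
  ) filtered_incomingEmails
  -- Join back to fetch the full row for each (sender, latest timestamp) pair
  JOIN incomingEmails USING (fromEmail, timestamp)
-- Collapse rows that tie on the same sender and timestamp
GROUP BY fromEmail, timestamp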

In order for the query to run efficiently, proper indexing is required.
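One plausible index for this query is a composite index on the grouping column followed by the column being maximised. This is a sketch, not a tuned recommendation: the index name idx_fromEmail_timestamp is hypothetical, and it assumes fromEmail is a VARCHAR short enough to be indexed in full.

```sql
-- Hypothetical index name. Column order matters: the GROUP BY / join
-- column comes first, then the column that MAX() is taken over.
CREATE INDEX idx_fromEmail_timestamp
    ON incomingEmails (fromEmail, `timestamp`);
```

With this column order, both the inner MAX(timestamp) lookup per sender and the join back on (fromEmail, timestamp) can be served from the same index. Running EXPLAIN on the query is the way to confirm it is actually used.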

Note that, for simplicity, I've removed the LOWER(), which in most cases won't be needed.

answered Oct 01 '22 18:10

Marcus

Here's one approach:

SELECT cur.textID, cur.fromEmail, cur.subject, 
     cur.timestamp, cur.read
FROM incomingEmails cur
LEFT JOIN incomingEmails next
    on cur.fromEmail = next.fromEmail
    and cur.timestamp < next.timestamp
WHERE next.timestamp is null
and cur.toUserID = '$userID' 
ORDER BY LOWER(cur.fromEmail)

Basically, you join the table to itself, searching for later rows from the same sender. In the WHERE clause you state that there cannot be any later rows. This gives you only the latest row for each sender.

If there can be multiple emails with the same timestamp, this query would need refining. If there's an incremental ID column in the email table, change the JOIN like this:

LEFT JOIN incomingEmails next
    on cur.fromEmail = next.fromEmail
    and cur.id < next.id
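On MySQL 8.0 or later, the same tie-breaking can be sketched with a window function instead of the self-join. This is a different technique from the one above, shown only for comparison. It assumes the incremental id column mentioned in the previous paragraph exists, and the names ranked and rn are made up for illustration.

```sql
-- Requires MySQL 8.0+ (window functions).
SELECT fromEmail, subject, `timestamp`
FROM (
    SELECT
        fromEmail,
        subject,
        `timestamp`,
        -- Number each sender's rows from newest to oldest;
        -- id breaks ties between rows with equal timestamps.
        ROW_NUMBER() OVER (
            PARTITION BY fromEmail
            ORDER BY `timestamp` DESC, id DESC
        ) AS rn
    FROM incomingEmails
) ranked
WHERE rn = 1          -- keep only the newest row per sender
ORDER BY fromEmail;
```

Because ROW_NUMBER() assigns exactly one row per partition the value 1, this returns a single row per sender even when timestamps collide.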

answered Oct 01 '22 18:10

Andomar

Do a GROUP BY after the ORDER BY by wrapping your query with the GROUP BY like this:

SELECT t.* FROM (SELECT * FROM `table` ORDER BY `time` DESC) t GROUP BY t.`from`

answered Oct 01 '22 16:10

11101101b

According to the SQL standard, you cannot use non-aggregate columns in the select list unless they appear in the GROUP BY clause. MySQL allows such usage (unless the ONLY_FULL_GROUP_BY mode is used), but the result is not predictable.

You should first select fromEmail and MIN(read), and then fetch the subject with a second query (or a subquery).
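A rough sketch of that two-step idea follows, adapted to the question's goal of finding the newest e-mail: it aggregates MAX(`timestamp`) rather than MIN(read). The names latest and latestTimestamp are made up for illustration.

```sql
SELECT
    latest.fromEmail,
    latest.latestTimestamp,
    -- Step 2: look up the subject of the row that carries the latest timestamp.
    (SELECT e.subject
       FROM incomingEmails e
      WHERE e.fromEmail = latest.fromEmail
        AND e.`timestamp` = latest.latestTimestamp
      LIMIT 1) AS subject
FROM (
    -- Step 1: standard-compliant aggregate, one row per sender.
    SELECT fromEmail, MAX(`timestamp`) AS latestTimestamp
    FROM incomingEmails
    GROUP BY fromEmail
) latest;
```

The LIMIT 1 keeps the subquery scalar if two e-mails from the same sender share a timestamp. Which of the tied subjects is returned is then unspecified.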

answered Oct 01 '22 16:10

noonex

I struggled with both of these approaches for queries more complex than those shown, because the subquery approach was horribly inefficient no matter what indexes I added, and because I couldn't get the outer self-join through Hibernate.

The best (and easiest) way to do this is to group by something which is constructed to contain a concatenation of the fields you require and then to pull them out using expressions in the SELECT clause. If you need to do a MAX() make sure that the field you want to MAX() over is always at the most significant end of the concatenated entity.

The key to understanding this is that the query can only make sense if these other fields are invariant for any entity which satisfies the MAX(), so in terms of the sort the other pieces of the concatenation can be ignored. The very bottom of this page explains how to do this: http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html
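Applied to the e-mail table from the question, the concatenation trick might look like the sketch below. It assumes `timestamp` is a DATETIME or TIMESTAMP column, so DATE_FORMAT() yields a fixed-width 14-character prefix. The alias latestSubject is made up for illustration.

```sql
SELECT
    fromEmail,
    MAX(`timestamp`) AS latestTimestamp,
    -- Build 'YYYYMMDDHHMMSS' + subject, take the greatest string per sender,
    -- then strip the 14-character timestamp prefix to recover the subject.
    SUBSTRING(
        MAX(CONCAT(DATE_FORMAT(`timestamp`, '%Y%m%d%H%i%s'), subject)),
        15
    ) AS latestSubject
FROM incomingEmails
GROUP BY fromEmail;
```

Because the timestamp occupies the most significant end of the string, MAX() orders by it first. Rows with a NULL subject produce a NULL concatenation and are skipped by MAX(), so they would need COALESCE() if they matter.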

If you can get an insert/update event (like a trigger) to pre-compute the concatenation of the fields, you can index it, and the query will be as fast as if the GROUP BY were over just the field you actually wanted to MAX(). You can even use it to get the maximum of multiple fields. I use it to do queries against multi-dimensional trees expressed as nested sets.

answered Oct 01 '22 16:10

Mike N
