I have two tables: one stores the users, the other stores the users' email addresses.
users (userId, username, etc)
userEmail (emailId, userId, email)
I would like to write a query that fetches the latest email address along with the user record.
I'm basically looking for a query that says
FIRST ORDER BY userEmail.emailId DESC
THEN GROUP BY userEmail.userId
This can be done with:
SELECT
    users.userId
  , users.username
  , (
        SELECT
            userEmail.email
        FROM userEmail
        WHERE userEmail.userId = users.userId
        ORDER BY userEmail.emailId DESC
        LIMIT 1
    ) AS email
FROM users
ORDER BY users.username;
But this does a subquery for every row and is very inefficient. (It is faster to do 2 separate queries and 'join' them together in my program logic).
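For reference, here is a minimal sketch of what the correlated-subquery version returns on a tiny data set. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the three users and their addresses are hypothetical sample data, not part of the original question.

```python
import sqlite3

# Hypothetical sample data; the schema follows the question, SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE userEmail (
        emailId INTEGER PRIMARY KEY,
        userId  INTEGER REFERENCES users(userId),
        email   TEXT
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO userEmail VALUES
        (1, 1, 'alice@old.example'),
        (2, 2, 'bob@example.com'),
        (3, 1, 'alice@new.example');
""")

# The scalar subquery is evaluated per user and picks the row with the highest emailId.
correlated_rows = conn.execute("""
    SELECT users.userId, users.username,
           (SELECT userEmail.email
              FROM userEmail
             WHERE userEmail.userId = users.userId
             ORDER BY userEmail.emailId DESC
             LIMIT 1) AS email
      FROM users
     ORDER BY users.username
""").fetchall()

for row in correlated_rows:
    print(row)
```

Note that a user without any address (carol in this sample) still appears, with a NULL email.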
The intuitive query to write for what I want would be:
SELECT
    users.userId
  , users.username
  , userEmail.email
FROM users
LEFT JOIN userEmail USING(userId)
GROUP BY users.userId
ORDER BY
    userEmail.emailId
  , users.username;
But this does not work as I would like. (The GROUP BY is performed before the sorting, so the ORDER BY userEmail.emailId has nothing left to do.)
So my question is:
Is it possible to write the first query without making use of the subqueries?
I've searched and read the other questions on Stack Overflow, but none seem to answer the question about this query pattern.
Using GROUP BY and ORDER BY Together
The GROUP BY clause is placed before the ORDER BY clause.
After grouping the data, you can filter the grouped records using the HAVING clause, which returns only the groups that match the given condition. You can also sort the grouped records using ORDER BY, for example on an aggregated column.
GROUP BY does not necessarily order the data. A database is designed to fetch the data as fast as possible and sorts only if necessary, so add an ORDER BY if you need a guaranteed order.
To summarize, the key difference between order by and group by is: ORDER BY is used to sort a result by a list of columns or expressions. GROUP BY is used to create unique combinations of a list of columns that can be used to form summaries.
The GROUP BY statement is used to group rows that have the same value, whereas the ORDER BY statement sorts the result set in either ascending or descending order.
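To make the distinction concrete, here is a small sketch, again assuming SQLite via Python's sqlite3 module and hypothetical sample rows. GROUP BY builds one summary row per userId, HAVING filters those groups, and ORDER BY sorts whatever remains.

```python
import sqlite3

# Hypothetical sample rows for the userEmail table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userEmail (emailId INTEGER PRIMARY KEY, userId INTEGER, email TEXT);
    INSERT INTO userEmail VALUES
        (1, 1, 'alice@old.example'),
        (2, 2, 'bob@example.com'),
        (3, 1, 'alice@new.example');
""")

# One summary row per user, sorted by the aggregated column.
summary = conn.execute("""
    SELECT userId, COUNT(*) AS emailCount, MAX(emailId) AS latestEmailId
      FROM userEmail
     GROUP BY userId
     ORDER BY emailCount DESC, userId
""").fetchall()

# HAVING filters the groups after aggregation: keep users with more than one address.
multiple = conn.execute("""
    SELECT userId, COUNT(*) AS emailCount, MAX(emailId) AS latestEmailId
      FROM userEmail
     GROUP BY userId
    HAVING COUNT(*) > 1
     ORDER BY userId
""").fetchall()

print(summary)
print(multiple)
```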
"But this does a subquery for every row and is very inefficient"
Firstly, do you have a query plan / timings that demonstrate this? The way you've done it (with the subselect) is pretty much the 'intuitive' way to do it. Many DBMS (though I'm not sure about MySQL) have optimisations for this case, and will have a way to execute the query only once.
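As a rough sketch of how to get such a plan, the snippet below asks SQLite for one via EXPLAIN QUERY PLAN. SQLite is only a stand-in here: in MySQL you would prefix the query with EXPLAIN instead, and both the output format and the optimiser's choices will differ. The empty tables are an assumption made purely to keep the example self-contained.

```python
import sqlite3

# Empty tables are enough to obtain a plan; the schema follows the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE userEmail (emailId INTEGER PRIMARY KEY, userId INTEGER, email TEXT);
""")

query = """
    SELECT users.userId, users.username,
           (SELECT userEmail.email
              FROM userEmail
             WHERE userEmail.userId = users.userId
             ORDER BY userEmail.emailId DESC
             LIMIT 1) AS email
      FROM users
     ORDER BY users.username
"""

# Each plan row ends with a human-readable description of one step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan:
    print(row[-1])
```

The exact wording of the plan depends on the SQLite version, so treat the output as a starting point for investigation rather than as evidence about MySQL.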
Alternatively, you should be able to create a subtable with ONLY (user id, latest email id) tuples and JOIN onto that:
SELECT
    users.userId
  , users.username
  , userEmail.email
FROM users
INNER JOIN
    (SELECT userId, MAX(emailId) AS latestEmailId
     FROM userEmail GROUP BY userId)
    AS latestEmails
    ON (users.userId = latestEmails.userId)
INNER JOIN userEmail ON
    (latestEmails.latestEmailId = userEmail.emailId)
ORDER BY users.username;
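Here is a quick sketch of how this join behaves, once more assuming SQLite via Python's sqlite3 module and hypothetical sample data. One thing worth knowing: with INNER JOIN, users who have no email address drop out of the result, whereas the subselect version returns them with a NULL email. The LEFT JOIN variant below is my own adjustment, not part of the query above, and keeps those users.

```python
import sqlite3

# Hypothetical sample data; carol deliberately has no email address.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE userEmail (emailId INTEGER PRIMARY KEY, userId INTEGER, email TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO userEmail VALUES
        (1, 1, 'alice@old.example'),
        (2, 2, 'bob@example.com'),
        (3, 1, 'alice@new.example');
""")

# Template with a placeholder for the join type, so both variants share one query text.
template = """
    SELECT users.userId, users.username, userEmail.email
      FROM users
      {join} (SELECT userId, MAX(emailId) AS latestEmailId
                FROM userEmail
               GROUP BY userId) AS latestEmails
        ON users.userId = latestEmails.userId
      {join} userEmail
        ON latestEmails.latestEmailId = userEmail.emailId
     ORDER BY users.username
"""

inner_rows = conn.execute(template.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(template.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # users with at least one email address
print(left_rows)   # all users, email is None where no address exists
```

Whether this is actually faster than the subselect depends on the DBMS, the indexes, and the data, so it is worth comparing query plans for both.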