Suppose I have this table tab
(fiddle available).
| g | a | b | v |
---------------------
| 1 | 3 | 5 | foo |
| 1 | 4 | 7 | bar |
| 1 | 2 | 9 | baz |
| 2 | 1 | 1 | dog |
| 2 | 5 | 2 | cat |
| 2 | 5 | 3 | horse |
| 2 | 3 | 8 | pig |
I'm grouping rows by g
, and for each group I want one value from column v
. However, I don't want any value, but I want the value from the row with maximal a
, and from all of those, the one with maximal b
. In other words, my result should be
| 1 | bar |
| 2 | horse |
I know of a query to achieve this:
SELECT grps.g,
(SELECT v FROM tab
WHERE g = grps.g
ORDER BY a DESC, b DESC
LIMIT 1) AS r
FROM (SELECT DISTINCT g FROM tab) grps
But I consider this query rather ugly. Mostly because it uses a dependant subquery, which feels like a real performance killer. So I wonder whether there is an easier solution to this problem.
The most likely answer I expect to this question would be some kind of add-on or patch for MySQL (or MariaDB) which does provide a feature for this. But I'll welcome other useful inspirations as well. Anything which works without a dependent subquery would qualify as an answer.
If your solution only works for a single ordering column, i.e. couldn't distinguish between cat
and horse
, feel free to suggest that answer as well as I expect it to be still useful to the majority of use cases. For example, 100*a+b
would be a likely way to order the above data by both columns while still using only a single expression.
I have a few pretty hackish solutions in mind, and might add them after a while, but I'll first look and see whether some nice new ones pour in first.
As it is pretty hard to compare the various answers just by looking at them, I've run some benchmarks on them. This was run on my own desktop, using MySQL 5.1. The numbers won't compare to any other system, only to one another. You probably should be doing your own tests with your real-life data if performance is crucial to your application. When new answers come in, I might add them to my script, and re-run all the tests.
So it seems that my own solution so far isn't all that bad, even with the dependent subquery. Surprisingly, the solution by acatt, which uses a dependent subquery as well and which I therefore would have considered about the same, performs much worse. Probably something the MySQL optimizer can't cope with. The solution RichardTheKiwi proposed seems to have good overall performance as well. The other two solutions heavily depend on the structure of the data. With many groups small groups, xdazz' approach outperforms all other, whereas the solution by Dems performs best (though still not exceptionally good) for few large groups.
Using Group By and Order By Together When combining the Group By and Order By clauses, it is important to bear in mind that, in terms of placement within a SELECT statement: The GROUP BY clause is placed after the WHERE clause. The GROUP BY clause is placed before the ORDER BY clause.
Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause. The original idea was to create the table in beginning of the query, so the (SELECT * FROM #TBL) could be used on the query itself, instead of defining the names on each GROUP BY.
In the query, GROUP BY clause is placed after the WHERE clause. In the query, GROUP BY clause is placed before ORDER BY clause if used any.
Order By and Group By Clause in SQLGroup By in SQL is used to arrange similar data into groups and Order By in SQL is used to sort the data in ascending or descending order.
SELECT g, a, b, v
FROM (
SELECT *,
@rn := IF(g = @g, @rn + 1, 1) rn,
@g := g
FROM (select @g := null, @rn := 0) x,
tab
ORDER BY g, a desc, b desc, v
) X
WHERE rn = 1;
Single pass. All the other solutions look O(n^2) to me.
This way doesn't use sub-query.
SELECT t1.g, t1.v
FROM tab t1
LEFT JOIN tab t2 ON t1.g = t2.g AND (t1.a < t2.a OR (t1.a = t2.a AND t1.b < t2.b))
WHERE t2.g IS NULL
Explanation:
The LEFT JOIN works on the basis that when t1.a is at its maximum value, there is no s2.a with a greater value and the s2 rows values will be NULL.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With