Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get SQL row by max of one column, group by another column

I need to read out users from an old database to use for statistics in a new system, but I do not have the original user table. However there is a statistics table with the sum of each year, where I also can find all needed user information. In addition this also only give me the users that were active, which is what I need.

The table has the following relevant columns: (the statistics columns are not relevant here)

  • UserID
  • FirstName
  • LastName
  • Email
  • Year

I want UserID to be distinct, so it is the only column that I can have in GROUP BY. I will run MAX on Year, to get the values from the most recent year. FirstName, LastName and Email need to be the same as the row where MAX(Year). In other words people may have changed both names and emails during the years, and I only want the last one, as it's the only one that is relevant.

My best suggestion for a SQL query goes like this:

SELECT UserID, Firstname, LastName, Email, MAX(Year) AS Year
FROM myTable
GROUP BY UserID
ORDER BY LastName, FirstName

The only problem is that SQL Server 2008 will not let me do anyhing like that, because all columns either has to be with a function like MAX or a part of GROUP BY. The columns FirstName, LastName and Email can't be under GROUP BY because that will make too many records. It seems somehow to work to put MAX on all of them, but then I have no way of knowing which column the MAX function is actually working on. I do not know for sure that it will be a problem, but I don't have time to look through 100 000 rows to see if there actually is a problem.

So in short I want the whole row of five columns where MAX works on only one column and GROUP BY on another. Does anybody have a good solution, or is it actually safe to use MAX on all non-grouping rows?

like image 959
bergenga Avatar asked Jan 19 '12 14:01

bergenga


People also ask

Can we use max with GROUP BY?

Max() function with Group by In this page we are discussing, how the GROUP BY clause along with the SQL MAX() can be used to find the maximum value of a column over each group.

How do I get the maximum value from multiple columns in SQL?

The MySQL Solution If you're working with MySQL, you can combine MAX() with the GREATEST() function to get the biggest value from two or more fields. Here's the syntax for GREATEST: GREATEST(value1,value2,...) Given two or more arguments, it returns the largest (maximum-valued) argument.

How do you SELECT the first value in a GROUP BY a bunch of rows?

To do that, you can use the ROW_NUMBER() function. In OVER() , you specify the groups into which the rows should be divided ( PARTITION BY ) and the order in which the numbers should be assigned to the rows ( ORDER BY ). You assign the row numbers within each group (i.e., year).

How does GROUP BY work with multiple columns in SQL?

The GROUP BY clause is used along with some aggregate functions to group columns with the same values in different rows. The group by multiple columns technique retrieves grouped column values from one or more database tables by considering more than one column as grouping criteria.


2 Answers

A few answers...


Correlated sub-query...

SELECT
  *
FROM
  myTable
WHERE
  Year = (SELECT MAX(Year) FROM myTable AS lookup WHERE lookup.UserID = myTable.UserID)


Join on derived aggregate...

SELECT
  *
FROM
  myTable
INNER JOIN
  (SELECT UserID, MAX(Year) AS Year FROM myTable GROUP BY UserID) AS lookup
    ON  lookup.UserID = myTable.UserID
    AND lookup.Year   = myTable.Year


Ordered CTE using ROW_NUMBER()...

WITH
  sequenced_data AS
(
  SELECT
    ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Year DESC) AS sequence_id,
    *
  FROM
    myTable
)
SELECT
  *
FROM
  sequenced_data
WHERE
  sequence_id = 1
like image 100
MatBailie Avatar answered Oct 21 '22 12:10

MatBailie


Dy you have only one year record per user? If yes, then you can use old'n'good join:

SELECT m.UserID, m.Firstname, m.LastName, m.Email, m.Year
FROM myTable m
    INNER JOIN (
        SELECT UserID, MAX(Year) as Year
        FROM myTable
        GROUP BY UserID
    ) x ON x.UserID=m.UserID and x.Year=m.Year
ORDER BY m.LastName, m.FirstName

Sure you can use constructs from newer SQL versions, I just have got used to older (=more generic) possibilities :).

like image 24
Arvo Avatar answered Oct 21 '22 12:10

Arvo