MySQL LEFT JOIN using MAX & GROUP BY on joined table?

Tags: mysql

I've got two tables (members and activities) and I'm trying to query the members with the latest activity for each member. I've got it working with two queries (one to get the members and a second with a max(id) and group by(member) on the activities) and some code to merge the data. I'm SURE it can be done with a single query, but I can't quite work it out. Any ideas?

members table

id, name
 1, Shawn
 2, bob
 3, tom

activities table

id, member_id, code, timestamp, description
 1,         1,  123,     15000, baked a cake
 2,         1,  456,     20000, ate dinner
 3,         2,  789,     21000, drove home
 4,         1,  012,     22000, ate dessert

desired result:

id, name,  activity_code, activity_timestamp, activity_description
 1, Shawn, 012,           22000,              ate dessert
 2, bob,   789,           21000,              drove home
 3, tom,   null,          null,               null

Shawn McBride, asked Dec 04 '12 06:12


3 Answers

The "latest per group" problem is extremely common in SQL. There are countless examples of solutions to this very problem on this site alone.

If your timestamps are unique within each member's activities:

SELECT
  m.id,
  m.name,
  a.code activity_code,
  a.timestamp activity_timestamp,
  a.description activity_description
FROM
  members m
  INNER JOIN activities a ON a.member_id = m.id
WHERE
  a.timestamp = (SELECT MAX(timestamp) FROM activities WHERE member_id = m.id)

Alternatively, if your activity ID increases monotonically with time:

  ...
WHERE
  a.id = (SELECT MAX(id) FROM activities WHERE member_id = m.id)

You don't need a GROUP BY in the outer query; the correlated subquery picks the latest row per member. The query will, however, benefit from an index on activities over (member_id, timestamp) or (member_id, id), respectively.
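
To try the first query against the sample data from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query itself is plain SQL), and the index name idx_activities_member_ts is made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activities (
        id INTEGER PRIMARY KEY,
        member_id INTEGER,
        code TEXT,              -- TEXT so that '012' keeps its leading zero
        timestamp INTEGER,
        description TEXT
    );
    INSERT INTO members VALUES (1, 'Shawn'), (2, 'bob'), (3, 'tom');
    INSERT INTO activities VALUES
        (1, 1, '123', 15000, 'baked a cake'),
        (2, 1, '456', 20000, 'ate dinner'),
        (3, 2, '789', 21000, 'drove home'),
        (4, 1, '012', 22000, 'ate dessert');
    -- Hypothetical index name; the column order matches the subquery's lookup.
    CREATE INDEX idx_activities_member_ts ON activities (member_id, timestamp);
""")

# The INNER JOIN variant: one row per member that has at least one activity.
latest_sql = """
    SELECT m.id, m.name, a.code, a.timestamp, a.description
    FROM members m
    INNER JOIN activities a ON a.member_id = m.id
    WHERE a.timestamp = (SELECT MAX(timestamp) FROM activities WHERE member_id = m.id)
    ORDER BY m.id
"""
rows = conn.execute(latest_sql).fetchall()
for row in rows:
    print(row)
```

With the sample data this returns one row each for Shawn and bob. tom has no activities, so the INNER JOIN leaves him out.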


EDIT

To show any members who have not logged an activity, use a left join like this.

SELECT
  m.id,
  m.name,
  a.code activity_code,
  a.timestamp activity_timestamp,
  a.description activity_description
FROM
  members m
  LEFT JOIN activities a ON 
    a.member_id = m.id
    AND a.timestamp = (SELECT MAX(timestamp) FROM activities WHERE member_id = m.id)

Note that there is no WHERE clause. Semantically, WHERE is applied after the joins are done, so putting the timestamp condition in a WHERE clause would remove the NULL-padded rows that the LEFT JOIN added, giving the same result as the original INNER JOIN.

But if you apply the additional predicate right in the join condition, the LEFT JOIN will work as expected.
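
The difference between the two placements can be checked on the question's sample data. This is a rough sketch that assumes Python's built-in sqlite3 module in place of MySQL; the variable names are illustrative only.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activities (
        id INTEGER PRIMARY KEY,
        member_id INTEGER,
        code TEXT,
        timestamp INTEGER,
        description TEXT
    );
    INSERT INTO members VALUES (1, 'Shawn'), (2, 'bob'), (3, 'tom');
    INSERT INTO activities VALUES
        (1, 1, '123', 15000, 'baked a cake'),
        (2, 1, '456', 20000, 'ate dinner'),
        (3, 2, '789', 21000, 'drove home'),
        (4, 1, '012', 22000, 'ate dessert');
""")

select_part = "SELECT m.id, m.name, a.code, a.timestamp, a.description FROM members m"
latest = "a.timestamp = (SELECT MAX(timestamp) FROM activities WHERE member_id = m.id)"

# Predicate inside the join condition: members without activities survive.
on_rows = conn.execute(
    f"{select_part} LEFT JOIN activities a ON a.member_id = m.id AND {latest} ORDER BY m.id"
).fetchall()

# Same predicate in WHERE: the NULL-padded row for tom is filtered out again.
where_rows = conn.execute(
    f"{select_part} LEFT JOIN activities a ON a.member_id = m.id WHERE {latest} ORDER BY m.id"
).fetchall()

print(len(on_rows), len(where_rows))
```

For tom, the comparison in the WHERE version evaluates NULL = NULL, which is not true, so his row is dropped and only two rows remain.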

Tomalak, answered Nov 05 '22 22:11


SELECT 
    members.id ,
    members.name,
    activities.code AS activity_code,
    activities.timestamp AS activity_timestamp,
    activities.description AS activity_description
FROM 
    members
    LEFT JOIN activities
        ON members.id = activities.member_id
    LEFT JOIN 
        (
            SELECT
                activities.member_id,
                MAX(activities.id) AS id
            FROM activities
            GROUP BY 
                activities.member_id
        ) AS t1
        ON activities.id = t1.id
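WHERE
    -- keep each member's latest activity, plus members with no activities at all
    t1.id IS NOT NULL
    OR activities.id IS NULL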

edze, answered Nov 05 '22 22:11


Select max(a.id), m.name, a.code as activity_code, a.timestamp as activity_timestamp, a.description as activity_description
From members m
     Left join
     activities a on a.member_id=m.id
-- caution: grouping by the activity columns yields one row per activity, not one per member
Group by  m.name, a.code, a.timestamp, a.description

Dale M, answered Nov 05 '22 23:11