How do I join on "most recent" records?

I've got two tables in a SQL Server 2000 database joined by a parent-child relationship. In the child table, the unique key is made up of the parent id and the datestamp.

I need to join these tables so that only the most recent child entry for each parent is joined.

Can anyone give me any hints how I can go about this?

BenAlabaster asked Jun 22 '10 19:06


2 Answers

Here's the most efficient way I've found to do this. I tested it against several table structures, and it had the lowest I/O of the approaches I compared.

This sample gets the latest revision of each article:

SELECT t.*
FROM ARTICLES AS t
    --Join each article to its history entries
        INNER JOIN  REVISION lastHis ON t.ID = lastHis.FK_ID
        --Look for a later history entry; the WHERE clause keeps only rows where none exists
            LEFT JOIN REVISION his2 on lastHis.FK_ID = his2.FK_ID and lastHis.CREATED_TIME < his2.CREATED_TIME
WHERE his2.ID is null
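
To see what the query returns, here is a minimal sketch that runs the same join on a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the TITLE column and all sample values are assumptions added for illustration; only ID, FK_ID and CREATED_TIME come from the query above.

```python
import sqlite3

# SQLite in memory stands in for SQL Server; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ARTICLES (ID INTEGER PRIMARY KEY, TITLE TEXT);
    CREATE TABLE REVISION (
        ID INTEGER PRIMARY KEY,
        FK_ID INTEGER,
        CREATED_TIME TEXT   -- ISO dates, so string comparison orders them
    );
    INSERT INTO ARTICLES VALUES (1, 'First'), (2, 'Second');
    INSERT INTO REVISION VALUES
        (10, 1, '2010-06-01'),
        (11, 1, '2010-06-20'),
        (12, 2, '2010-06-05');
""")

# Same shape as the query above: his2 matches only LATER revisions,
# so his2.ID IS NULL keeps the revision that has nothing after it.
rows = conn.execute("""
    SELECT t.ID, t.TITLE, lastHis.ID, lastHis.CREATED_TIME
    FROM ARTICLES AS t
        INNER JOIN REVISION lastHis ON t.ID = lastHis.FK_ID
        LEFT JOIN REVISION his2
            ON lastHis.FK_ID = his2.FK_ID
           AND lastHis.CREATED_TIME < his2.CREATED_TIME
    WHERE his2.ID IS NULL
    ORDER BY t.ID
""").fetchall()

for row in rows:
    print(row)
# (1, 'First', 11, '2010-06-20')
# (2, 'Second', 12, '2010-06-05')
```

One thing to watch: if two revisions of the same article share the same CREATED_TIME and that time is the latest, neither has a later sibling, so both rows come back.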
Laramie answered Oct 07 '22 15:10


If you had a table which just contained the most recent entry for each parent, and the parent's id, then it would be easy, right?

You can make a table like that by joining the child table on itself, taking only the maximum datestamp for each parent id. Something like this (your SQL dialect may vary):

   SELECT t1.*
     FROM child AS t1
LEFT JOIN child AS t2
       ON (t1.parent_id = t2.parent_id and t1.datestamp < t2.datestamp)
    WHERE t2.datestamp IS NULL

That gets you every row in the child table for which no later datestamp exists for the same parent id. You can use that result as a subquery to join to:

   SELECT *
     FROM parent
     JOIN ( SELECT t1.*
              FROM child AS t1
         LEFT JOIN child AS t2
                ON (t1.parent_id = t2.parent_id and t1.datestamp < t2.datestamp)
             WHERE t2.datestamp IS NULL ) AS most_recent_children
       ON (parent.id = most_recent_children.parent_id)

or join the parent table directly into it:

   SELECT parent.*, t1.*
     FROM parent
     JOIN child AS t1
       ON (parent.id = t1.parent_id)
LEFT JOIN child AS t2
       ON (t1.parent_id = t2.parent_id and t1.datestamp < t2.datestamp)
    WHERE t2.datestamp IS NULL
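
As a sanity check, the sketch below runs the direct-join form on made-up rows and compares it with a GROUP BY / MAX(datestamp) derived table, which is the "maximum datestamp for each parent id" idea written out literally. It uses Python's built-in sqlite3 module rather than SQL Server, and the name and note columns and all sample values are assumptions for illustration.

```python
import sqlite3

# SQLite in memory stands in for SQL Server; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (
        parent_id INTEGER,
        datestamp TEXT,
        note TEXT,
        PRIMARY KEY (parent_id, datestamp)  -- unique key from the question
    );
    INSERT INTO parent VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO child VALUES
        (1, '2010-06-01', 'old'),
        (1, '2010-06-22', 'new'),
        (2, '2010-05-15', 'only');
""")

# Self-join form: keep t1 only when no later t2 exists for that parent.
anti_join = conn.execute("""
    SELECT parent.id, t1.datestamp, t1.note
      FROM parent
      JOIN child AS t1
        ON parent.id = t1.parent_id
 LEFT JOIN child AS t2
        ON t1.parent_id = t2.parent_id AND t1.datestamp < t2.datestamp
     WHERE t2.datestamp IS NULL
  ORDER BY parent.id
""").fetchall()

# Cross-check: pick the row whose datestamp equals the per-parent maximum.
max_per_group = conn.execute("""
    SELECT parent.id, c.datestamp, c.note
      FROM parent
      JOIN child AS c
        ON parent.id = c.parent_id
      JOIN ( SELECT parent_id, MAX(datestamp) AS latest
               FROM child
           GROUP BY parent_id ) AS m
        ON c.parent_id = m.parent_id AND c.datestamp = m.latest
  ORDER BY parent.id
""").fetchall()

print(anti_join)                   # [(1, '2010-06-22', 'new'), (2, '2010-05-15', 'only')]
print(anti_join == max_per_group)  # True
```

Parent 3 has no child rows, so the inner join to child drops it in both forms. Because (parent_id, datestamp) is unique, each remaining parent gets exactly one row.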
Ian Clelland answered Oct 07 '22 14:10