How to remove duplicates from an Oracle result set based on multiple tables

I have 2 tables: a main APPLICATION table that holds the core data, and a STATUSTRACKING table that reflects the status changes of the core data in the APPLICATION table.

There is a 1:N relationship between the APPLICATION table and the STATUSTRACKING table. Examples follow:

-- APPLICATION
SELECT A.ID, A.DECISIONON, A.APPLICATIONSTATUS_ID AS CURRENTSTATUS
FROM APPLICATION A
WHERE A.ID=1099;

Results:
ID      DECISIONON  CURRENTSTATUS
1099    5/05/2009   5

-- STATUS TRACKING
SELECT ST.ID, ST.APPLICATION_ID AS APP_ID, ST.APPLICATIONSTATUS_ID AS STATUS, ST.CREATEDATE 
FROM STATUSTRACKING ST
WHERE ST.APPLICATION_ID=1099
ORDER BY ST.CREATEDATE DESC;

Results:
ID      APP_ID  STATUS  CREATEDATE
44466   1099    5       5/05/2009
44458   1099    7       5/05/2009
10826   1099    8       21/07/2008
9770    1099    7       9/07/2008
4410    1099    3       9/05/2008
3814    1099    2       2/05/2008
3803    1099    1       2/05/2008

Problem: I need to select the STATUSTRACKING record with the MOST RECENT DATE for a particular status. This is my attempt:

SELECT A.ID AS APP_ID, A.APPLICATIONSTATUS_ID AS CURR_ST, Z.ID AS ST_ID, Z.CREATEDATE, Z.APPLICATIONSTATUS_ID AS OLD_ST
FROM APPLICATION A, STATUSTRACKING Z
WHERE A.APPLICATIONSTATUS_ID in (5,6)
    AND A.ID IN (337,1099,1404,9441)
    AND Z.APPLICATIONSTATUS_ID = 7
    AND Z.APPLICATION_ID=A.ID
ORDER BY A.ID ASC, Z.CREATEDATE DESC;

Results:
APP_ID  CURR_ST ST_ID       CREATEDATE  OLD_ST
337     6       13978       17/08/2008  7
1099    5       44458       5/05/2009   7
1099    5       9770        9/07/2008   7
1404    6       15550       28/08/2008  7
9441    5       49271       3/06/2009   7
9441    5       46058       13/05/2009  7

The problem is the duplicated rows. I must only show the row with the MOST RECENT CreateDate. In the case of App_ID 1099, it would be this record:

1099    5       44458       5/05/2009   7

I obviously don't want to exclude the rows that are not duplicated.

I thought I was on the right track with this statement which gives the row I'm looking for:

SELECT A.ID, A.APPLICATION_ID, A.APPLICATIONSTATUS_ID, A.CREATEDATE
FROM    (SELECT * 
            FROM STATUSTRACKING 
            WHERE APPLICATIONSTATUS_ID=7 
                AND APPLICATION_ID=1099 
            ORDER BY CREATEDATE DESC) A
WHERE ROWNUM = 1;

... but I can't seem to get it to work with my main select statement.

The result set I want should look like this:

APP_ID  CURR_ST ST_ID       CREATEDATE  OLD_ST
337     6       13978       17/08/2008  7
1099    5       44458       5/05/2009   7
1404    6       15550       28/08/2008  7
9441    5       49271       3/06/2009   7
etc. ...

I'm no Oracle/SQL expert, so any help would be appreciated.

user1058946 asked Dec 04 '25 18:12


1 Answer

The easiest approach is probably to use an analytic function, something like:

SELECT *
  FROM (
    SELECT A.ID AS APP_ID, 
           A.APPLICATIONSTATUS_ID AS CURR_ST, 
           Z.ID AS ST_ID, 
           Z.CREATEDATE, 
           Z.APPLICATIONSTATUS_ID AS OLD_ST,
           rank() over (partition by a.id order by z.createDate desc) rnk
      FROM APPLICATION A, 
           STATUSTRACKING Z
     WHERE A.APPLICATIONSTATUS_ID in (5,6)
       AND A.ID IN (337,1099,1404,9441)
       AND Z.APPLICATIONSTATUS_ID = 7
       AND Z.APPLICATION_ID=A.ID
    )
 WHERE rnk = 1

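If there can be ties (two status-7 rows for the same application with the same CREATEDATE), RANK will return every tied row. If you need exactly one row per application, use the ROW_NUMBER analytic function instead, ideally with a tie-breaker added to the ORDER BY so the choice is deterministic. Note that DENSE_RANK does not help here: for rnk = 1 it returns the same rows as RANK.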
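As a minimal sketch of the ROW_NUMBER variant: this assumes that Z.ID DESC is an acceptable tie-breaker when two rows share the same CREATEDATE (that is, a higher STATUSTRACKING ID means a later row), which the question does not confirm. It also writes the join in ANSI syntax, which is a style choice and should be equivalent to the comma join above.

```sql
-- Sketch only: ROW_NUMBER returns exactly one row per application,
-- even when two status-7 rows share the same CREATEDATE.
-- Assumption: Z.ID DESC is a suitable tie-breaker (not confirmed by the question).
SELECT APP_ID, CURR_ST, ST_ID, CREATEDATE, OLD_ST
  FROM (
    SELECT A.ID AS APP_ID,
           A.APPLICATIONSTATUS_ID AS CURR_ST,
           Z.ID AS ST_ID,
           Z.CREATEDATE,
           Z.APPLICATIONSTATUS_ID AS OLD_ST,
           ROW_NUMBER() OVER (PARTITION BY A.ID
                              ORDER BY Z.CREATEDATE DESC, Z.ID DESC) AS RN
      FROM APPLICATION A
      JOIN STATUSTRACKING Z
        ON Z.APPLICATION_ID = A.ID
     WHERE A.APPLICATIONSTATUS_ID IN (5, 6)
       AND A.ID IN (337, 1099, 1404, 9441)
       AND Z.APPLICATIONSTATUS_ID = 7
    )
 WHERE RN = 1
 ORDER BY APP_ID;
```

Listing the columns explicitly in the outer SELECT, rather than using SELECT *, keeps the helper column RN out of the result set.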

Justin Cave answered Dec 06 '25 07:12


