Oracle joins ( left outer, right, etc. :S )

I knew Stack Overflow would help me with more than finding out what the "favorite programming cartoon" is :P

This was the accepted answer by: Bill Karwin

Thanks to all for the help ( I would like to double vote you all )

My query ended up like this ( this is the real one )

SELECT 
    accepted.folio,
    COALESCE( inprog.activityin, accepted.activityin ) as activityin,
    inprog.participantin,
    accepted.completiondate 
FROM performance accepted
    LEFT OUTER JOIN performance inprog 
        ON( accepted.folio = inprog.folio 
            AND inprog.ACTIVITYIN 
            IN ( 4, 435 )                    -- both are ids for inprogress
            AND inprog.PARTICIPANTIN != 1  ) -- Ignore the "bot" participant
    LEFT OUTER JOIN performance closed
        ON( accepted.folio = closed.folio 
            AND closed.ACTIVITYIN IN ( 10,436, 4, 430  ) )  -- all these are closed or cancelled
WHERE accepted.ACTIVITYIN IN ( 3, 429 )      --- both are id for new 
AND accepted.folio IS NOT NULL
AND closed.folio IS NULL;

Now I just have to join with the other tables for a human readable report.


ORIGINAL POST

Hello.

I've been struggling for about 6 hours now with a DB query ( my long time nemesis )

I have a data table with some fields like:

table performance( 
     identifier  varchar, 
     activity    number, 
     participant number, 
     closedate   date 
)

It is used to keep track of the history of a ticket.

identifier: a customer id like ( NAF0000001 )

activity: a fk to where the ticket is ( new, in_progress, rejected, closed, etc )

participant: a fk to whoever is attending the ticket at that point

closedate: the date when that activity finished.

EDIT: I should have said "completiondate" rather than closedate. This is the date when the activity was completed, not necessarily when the ticket was closed.

For instance a typical history may be like this:

identifier|activity|participant|closedate
-------------------------------------------
NA00000001|       1|          1|2008/10/08 15:00|
-------------------------------------------
NA00000001|       2|          2|2008/10/08 15:20|
-------------------------------------------
NA00000001|       3|          2|2008/10/08 15:40|
-------------------------------------------
NA00000001|       4|          4|2008/10/08 17:05|
-------------------------------------------

And participants 1=john, 2=scott, 3=mike, 4=rob

and activities 1=new, 2=inprogress, 3=waitingforapproval, 4=closed

etc., plus tens of other irrelevant bits of info.

Well my problem is the following.

I have managed to create a query where I can know when a ticket was opened and closed

it is like this:

 select 
     a.identifier,
     a.participant,
     a.closedate as start,
     b.closedate as finish      
from 
    performance a,
    performance b
where
    a.activity = 1 -- new
    and b.activity = 4 -- closed
    and a.identifier = b.identifier

But I can't tell which tickets are not closed and who is attending them.

So far I have something like this:

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a        
where
    a.activity = 1 -- new
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed

That gives me all the ones that have a start ( new = 1 ) but are not closed ( closed = 4 )

But the big problem here is that it prints the participant who opened the ticket, but I need the participant who is attending it. So I added the "inprogress" activity to the query.

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a,
    performance b        
where
    a.activity = 1 -- new        
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed
    and b.identifier = a.identifier
    and b.activity = 2  -- inprogress..

But not all the rows that are in "new" are "inprogress", and with that query I drop all the ones that aren't.

What I need is to show the "inprogress" participant for every ticket, and if the ticket is not "inprogress", show it as empty.

Something like

identifier|activity|participant|closedate
-------------------------------------------
NA00000002|       1|           |2008/10/08 15:00|
-------------------------------------------
NA00000003|       1|           |2008/10/08 15:20|
-------------------------------------------
NA00000004|       1|           |2008/10/08 15:40|
-------------------------------------------
NA00000005|       2|          4|2008/10/08 15:40|
-------------------------------------------
NA00000006|       2|          4|2008/10/08 15:40|

In this case

NA002, NA003 and NA004 are in "new", so no participant is shown

While

NA005 and NA006 are "inprogress" ( act = 2 ) and they are being attended by rob ( participant 4 )

So I remember there was this thing called left outer join or something like that, but I never understood it. What I would like to know is how I can fetch the identifiers that are "inprogress" and "new" and that are not closed.

Probably taking a little rest would help me to clear my mind. If anyone knows how to do it I'll appreciate it.

By the way I've tried:

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a
    left outer join
    performance b  
    on      
    b.identifier = a.identifier
where
    a.activity = 1 -- new        
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed
    and b.activity = 2  -- inprogress..

But it gives me the same result as the previous one ( it drops the records that are only in "new" )

OscarRyz asked Dec 30 '22 09:12



2 Answers

Try something like this (I haven't tested it):

SELECT p_new.identifier, COALESCE(p_inprog.activity, p_new.activity) AS activity,
  p_inprog.participant, COALESCE(p_inprog.closedate, p_new.closedate) AS closedate
FROM performance p_new
  LEFT OUTER JOIN performance p_inprog 
    ON (p_new.identifier = p_inprog.identifier AND p_inprog.activity = 2)
  LEFT OUTER JOIN performance p_closed 
    ON (p_new.identifier = p_closed.identifier AND p_closed.activity = 4)
WHERE p_new.activity = 1
  AND p_closed.identifier IS NULL;
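For anyone who wants to try it without an Oracle instance, here is a rough sketch that runs the same query from Python against an in-memory SQLite database. SQLite standing in for Oracle, the sample rows (modelled on the tables in the question) and the added ORDER BY are all assumptions of this sketch, not part of the original schema.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; columns follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performance ("
    " identifier TEXT, activity INTEGER, participant INTEGER, closedate TEXT)"
)
conn.executemany(
    "INSERT INTO performance VALUES (?, ?, ?, ?)",
    [
        # Ticket 1 went through the whole cycle and is closed.
        ("NA00000001", 1, 1, "2008-10-08 15:00"),
        ("NA00000001", 2, 2, "2008-10-08 15:20"),
        ("NA00000001", 3, 2, "2008-10-08 15:40"),
        ("NA00000001", 4, 4, "2008-10-08 17:05"),
        # Ticket 2 is only "new".
        ("NA00000002", 1, 1, "2008-10-08 15:00"),
        # Ticket 5 is "inprogress", attended by participant 4.
        ("NA00000005", 1, 1, "2008-10-08 15:10"),
        ("NA00000005", 2, 4, "2008-10-08 15:40"),
    ],
)

# The query from this answer; ORDER BY is added only to make the output stable.
query = """
SELECT p_new.identifier, COALESCE(p_inprog.activity, p_new.activity) AS activity,
  p_inprog.participant, COALESCE(p_inprog.closedate, p_new.closedate) AS closedate
FROM performance p_new
  LEFT OUTER JOIN performance p_inprog
    ON (p_new.identifier = p_inprog.identifier AND p_inprog.activity = 2)
  LEFT OUTER JOIN performance p_closed
    ON (p_new.identifier = p_closed.identifier AND p_closed.activity = 4)
WHERE p_new.activity = 1
  AND p_closed.identifier IS NULL
ORDER BY p_new.identifier
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On these rows the closed ticket drops out, the "new" ticket comes back with an empty participant, and the "inprogress" ticket comes back with participant 4.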

I think people believe outer joins are harder than they really are. For example:

A LEFT OUTER JOIN B ON (...condition...)

This returns all rows from A, whether or not there are any matching rows in B. If no rows in B match, all columns B.* are treated as NULL in the result set for that row of A. The join condition can be any expression that a row in B must satisfy; a row of B that fails it isn't included in the join. So the stricter the condition, the more rows of A end up solo.
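A small sketch of why it matters where the filter on B goes, which is also why the last attempt in the question dropped the "new"-only tickets. It uses Python's sqlite3 module as a stand-in for Oracle, and the two sample tickets are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the two sample tickets are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performance ("
    " identifier TEXT, activity INTEGER, participant INTEGER, closedate TEXT)"
)
conn.executemany(
    "INSERT INTO performance VALUES (?, ?, ?, ?)",
    [
        ("NA00000002", 1, 1, "2008-10-08 15:00"),  # new only
        ("NA00000005", 1, 1, "2008-10-08 15:10"),  # new ...
        ("NA00000005", 2, 4, "2008-10-08 15:40"),  # ... then inprogress
    ],
)

# Filter on b in WHERE: it runs after the join, so any row of a without an
# activity-2 partner is discarded, just as with an inner join.
where_rows = conn.execute("""
    SELECT a.identifier, b.participant
    FROM performance a
      LEFT OUTER JOIN performance b ON b.identifier = a.identifier
    WHERE a.activity = 1 AND b.activity = 2
    ORDER BY a.identifier
""").fetchall()

# Filter on b in ON: it only decides which rows of b may match; rows of a
# with no match are kept, with b.* as NULL.
on_rows = conn.execute("""
    SELECT a.identifier, b.participant
    FROM performance a
      LEFT OUTER JOIN performance b
        ON b.identifier = a.identifier AND b.activity = 2
    WHERE a.activity = 1
    ORDER BY a.identifier
""").fetchall()

print("filter in WHERE:", where_rows)
print("filter in ON:   ", on_rows)
```

With the filter in WHERE only NA00000005 survives; with the filter in ON, NA00000002 is kept too, with an empty participant.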

Bill Karwin answered Jan 01 '23 22:01



Typically the better way to write those is with EXISTS. The first one would be:

select * from performance p1
where not exists 
    ( select * from performance p2 
      where p2.identifier = p1.identifier and p2.activity = 4 )

This way lets you do a keyed lookup on performance.identifier, rather than potentially having to build a massive list of identifiers in (select identifier from performance where activity=4).
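Separate from the lookup argument, the two forms also differ when the subquery can return a NULL identifier. A minimal sketch, assuming SQLite (via Python's sqlite3) behaves like Oracle on this point, with made-up sample rows and a hypothetical closed row that has no identifier:

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performance ("
    " identifier TEXT, activity INTEGER, participant INTEGER, closedate TEXT)"
)
conn.executemany(
    "INSERT INTO performance VALUES (?, ?, ?, ?)",
    [
        ("NA00000001", 1, 1, "2008-10-08 15:00"),  # new ...
        ("NA00000001", 4, 4, "2008-10-08 17:05"),  # ... and closed
        ("NA00000002", 1, 1, "2008-10-08 15:00"),  # new, still open
    ],
)

NOT_IN = """
    SELECT a.identifier FROM performance a
    WHERE a.activity = 1
      AND a.identifier NOT IN
          (SELECT identifier FROM performance WHERE activity = 4)
"""
NOT_EXISTS = """
    SELECT p1.identifier FROM performance p1
    WHERE p1.activity = 1
      AND NOT EXISTS
          (SELECT * FROM performance p2
           WHERE p2.identifier = p1.identifier AND p2.activity = 4)
"""

# On clean data both forms find the same open ticket.
clean_not_in = conn.execute(NOT_IN).fetchall()
clean_not_exists = conn.execute(NOT_EXISTS).fetchall()

# Hypothetical bad row: a closed activity with no identifier.
conn.execute("INSERT INTO performance VALUES (NULL, 4, 4, '2008-10-08 18:00')")

# NOT IN against a list containing NULL is never true, so it returns nothing.
null_not_in = conn.execute(NOT_IN).fetchall()
# NOT EXISTS compares row by row and is unaffected by the NULL row.
null_not_exists = conn.execute(NOT_EXISTS).fetchall()

print(clean_not_in, clean_not_exists)
print(null_not_in, null_not_exists)
```

If identifier can be NULL in the real table, that is one more reason to prefer NOT EXISTS, or to filter the NULLs out of the subquery explicitly.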

Andy Lester answered Jan 01 '23 23:01
