Question

I knew stackoverflow would help me for other than know what is the "favorite programming cartoon" :P

This was the accepted answer by: Bill Karwin

Thanks to all for the help ( I would like to double vote you all )

My query ended up like this ( this is the real one )

SELECT 
    accepted.folio,
    COALESCE( inprog.activityin, accepted.activityin ) as activityin,
    inprog.participantin,
    accepted.completiondate 
FROM performance accepted
    LEFT OUTER JOIN performance inprog 
        ON( accepted.folio = inprog.folio 
            AND inprog.ACTIVITYIN 
            IN ( 4, 435 )                    -- both are ids for inprogress
            AND inprog.PARTICIPANTIN != 1  ) -- Ignore the "bot" participant
    LEFT OUTER JOIN performance closed
        ON( accepted.folio = closed.folio 
            AND closed.ACTIVITYIN IN ( 10,436, 4, 430  ) )  -- all these are closed or cancelled
WHERE accepted.ACTIVITYIN IN ( 3, 429 )      --- both are id for new 
AND accepted.folio IS NOT NULL
AND closed.folio IS NULL;

Now I just have to join with the other tables for a human readable report.


ORIGINAL POST

Hello.

I'm struggling for about 6 hrs. now with a DB query ( my long time nemesis )

I have a data table with some fields like:

table performance( 
     identifier varchar, 
     activity    number, 
     participant number, 
     closedate   date, 
)

It is used to keep track of the history of ticket

Identifier: is a customer id like ( NAF0000001 )

activity: is a fk of where the ticket is ( new, in_progress, rejected, closed, etc )

participant: is a fk of who is attending at that point the ticket

closedate: is the date when that activity finished.

EDIT: I should have said "completiondate" rather than closedate. This is the date when the activity was completed, not necessary when the ticket was closed.

For instance a typical history may be like this:

identifier|activity|participant|closedate
-------------------------------------------
NA00000001|       1|          1|2008/10/08 15:00|
-------------------------------------------
NA00000001|       2|          2|2008/10/08 15:20|
-------------------------------------------
NA00000001|       3|          2|2008/10/08 15:40|
-------------------------------------------
NA00000001|       4|         4|2008/10/08 17:05|
-------------------------------------------

And participant 1=jonh, 2=scott, 3=mike, 4=rob

and activties 1=new, 2=inprogress, 3=waitingforapproval, 4=closed

etc. And tens of other irrelevant info.

Well my problem is the following.

I have managed to create a query where I can know when a ticket was opened and closed

it is like this:

 select 
     a.identifier,
     a.participant,
     a.closedate as start,
     b.closedate as finish      
from 
    performance a,
    performance b
where
    a.activity = 1 -- new
    and b.activity = 4 -- closed
    and a.identifier = b.identifier

But I can't know what tickets are not closed and who is attending them.

So far I have something like this:

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a        
where
    a.activity = 1 -- new
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed

That is give me all the ones who have an start ( new = 1 ) but are not closed ( closed = 4 )

But the big problem here is that it prints the participant who opened the ticket, but I need the participant who is attending it. So I add the "inprogress" activity to the query.

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a,
    performance b        
where
    a.activity = 1 -- new        
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed
    and b.identifier = a.identifier
    and b.activity = 2  -- inprogress..

But not all the rows that are in "new" are "inprogress" and with that query I drop all of them.

What I need is to show all the "inprogress" participant and if the ticket is not "inprogress", it will show as empty.

Somthing like

    identifier|activity|participant|closedate
-------------------------------------------
NA00000002|       1|           |2008/10/08 15:00|
-------------------------------------------
NA00000003|       1|           |2008/10/08 15:20|
-------------------------------------------
NA00000004|       1|           |2008/10/08 15:40|
-------------------------------------------
NA00000005|       2|          4|2008/10/08 15:40|
-------------------------------------------
NA00000006|       2|          4|2008/10/08 15:40|

In this case

NA002, NA003 and NA004 are in "new", so no participant is shown

While

NA005 and NA006 are being "inprgress (act = 2 )" and they are being attended by rob ( participant 4 )

So I remember there was this thing called left outer join or something like that but I never ever understand it. What I would like to know is how can I fetch the identifiers that are "inprogress" and "new" and that are not closed.

Probably taking a little rest would help me to clear my mind. If anyone knows how to do it I'll appreciate it.

By the way I've tried:

 select 
     a.identifier,
     a.participant,
     a.closedate as start
from 
    performance a
    left outer join
    performance b  
    on      
    b.identifier = a.identifier
where
    a.activity = 1 -- new        
    and a.identifier not in ( select identifier from performance where activity = 4 ) --closed
    and b.activity = 2  -- inprogress..

But gives me the same result as the previous ( drop the only in "new" records )

Was it helpful?

Solution

Try something like this (I haven't tested it):

SELECT p_new.identifier, COALESCE(p_inprog.activity, p_new.activity) AS activity,
  p_inprog.participant, COALESCE(p_inprog.closedate, p_new.closedate) AS closedate
FROM performance p_new
  LEFT OUTER JOIN performance p_inprog 
    ON (p_new.identifier = p_inprog.identifier AND p_inprog.activity = 2)
  LEFT OUTER JOIN performance p_closed 
    ON (p_new.identifier = p_closed.identifier AND p_closed.activity = 4)
WHERE p_new.activity = 1
  AND p_closed.identifier IS NULL;

I think people believe outer joins are harder than they really are. For example:

A LEFT OUTER JOIN B ON (...condition...)

This returns all rows from A, whether or not there are any matching rows in B. If no rows in B match, treat all columns B.* as NULL in the result set for that row of A. The join condition can be an expression that the row in B must satisfy, or else it isn't included in the join. So, more rows in A will be solo.

OTHER TIPS

Typically the better way to write those is with EXISTS. The first one would be:

select * from performance p1
where not exists 
    ( select * from performance p2 
      where p2.identifier = p1.identifier and p2.activity = 4 )

This way lets you do a keyed lookup on performance.identifier, rather than potentially having to build a massive list of identifiers in (select identifier from performance where activity=4).

I think this should do it.

The first part gets all records that are new, not closed and not in progress. The second part gets all in progress records. We then join them together, we can also sort by identifier by wrapping a 'SELECT * FROM' around this query.

select 
  a.identifier,
  a.participant,
  a.closedate as start
from 
  performance a
where
  a.activity = 1
  and not exists ( select identifier 
                   from performance b 
                   where b.activity = 4 
                   and b.identifier = a.identifier) 
  and not exists ( select identifier 
                   from performance c 
                   where c.activity = 2 
                   and c.identifier = a.identifier) 
UNION ALL
select 
  a.identifier,
  a.participant,
  a.closedate as start
from 
  performance a
where
  a.activity = 2
  and not exists ( select identifier 
                   from performance b 
                   where b.activity = 4 
                   and b.identifier = a.identifier); 

I would suggest that what you want is the earliest record (presumably, but not necessarily the one with activity=1) and the most recent record (regardless of activity number). If the activity of the most recent record is 4 then the ticket is closed. otherwise, the participant is the current holder of the ticket. There is a potential bug introduced by just matching on activity = 4 if the ticket can be re-opened.

Actually, based upon your example, you may not even need the earliest record. How about the following:

SELECT
        identifier,
        activity,
        participant,
        closedate
    FROM
        performance a
    WHERE
        (a.identifier, a.closedate) in
            (select b.identifier, max(b.closedate)
                from performance b
                group by b.identifier
            )
;

How about this:

SELECT * FROM (
  SELECT identifier,
         MAX(activity) activity,
         MAX(participant) KEEP (DENSE_RANK LAST ORDER BY activity)
    FROM performance
    GROUP BY identifier
)
WHERE activity in (1,2)

The inner query gives the latest activity for each ticket and its corresponding participant. The outer query filters this down to the ones where the activity is either "new" or "in progress".

I love the DENSE_RANK functions.

Firstly, you may have a design issue if you can have a customer with multiple tickets open at the same time. You should ideally have a ticket_id, and then you can perform Andy's query by using ticket_id instead of identifier.

What tickets are not closed:

select identifier as closed_identifier 
  from performance where identifier not exists
  (select identifier from performance where activity=4)

Tickets that are being attended:

select identifier as inprogress_identifier, participant performance 
  from performance where activity=2

Unclosed tickets, with the participant of that are being attended:

select * from 
  (select identifier as notclosed_identifier 
    from performance where identifier not exists
    (select identifier from performance where activity=4)) closed 
left join 
  (select identifier as inprogress_identifier, participant performance 
    from performance where activity=2) attended 
on notclosed_identifier=inprogress_identifier

May be you can use this kind of query as a starting point.

select x.identifier, 
       max(x.p_1) as new_participant, max(x.c_1) as new_date,
       max(x.p_2) as inprogress_participant, max(x.c_2) as inprogress_date,
       max(x.p_3) as approval_participant, max(x.c_3) as approval_date,
       max(x.p_4) as closing_participant, max(x.c_4) as closing_date
  from (
        select a.identifier, 
               decode (activity, 1, participant, null) as p_1,  decode (activity, 1, closedate, null) as c_1,
               decode (activity, 2, participant, null) as p_2,  decode (activity, 2, closedate, null) as c_2,
               decode (activity, 3, participant, null) as p_3,  decode (activity, 3, closedate, null) as c_3,
               decode (activity, 4, participant, null) as p_4,  decode (activity, 4, closedate, null) as c_4
          from performance a
        ) x
 group by x.identifier

The idea is to serialize your table from row into field, and create a view based on it. You can create report based on this view.

Regards,

Just a quick idea that others might build on (untested, but I hope the idea comes across):

First, select all not yet closed activities (as posted by others):

select id
from performance p1 where identifier not exists
  (select * from performance p2 where activity=4 and p1.id=p2.id)

Then, you can add the person attending the activity by adding a subquery in the select clause:

select id,
 (select participant 
  from performance p3 
  where p3.activity=3 and p1.id=p2.id)
from performance p1 where identifier not exists
  (select * from performance p2 where activity=4 and p1.id=p2.id)

If there is no activity 3 record for this id, the subquery returns null which is exactly what we need.

Hope this helps - please expand if necessary.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top