Question

This is very similar to a question I have already asked (and had answered) here: Identifying start and end of period covered - oracle 10g sql

However, I've not been able to work out how to achieve what I need in this case. I have data that looks like this:

ID  Start            End    End Code        Worker ID
A   02/08/2003  23/01/2007  A               1
A   24/01/2007  17/11/2008  J               2
A   03/03/2009  20/10/2009  A               3
A   21/10/2009  08/03/2010  A               4
A   09/03/2010  29/07/2010  A               5
A   30/07/2010                              6

The end code 'A' means that a case has been reassigned, 'J' means that it has been closed. I would like data that looks like the following:

     ID Start             End   Worker IDs  End Worker
     A  02/08/2003  17/11/2008  1,2         2
     A  03/03/2009              3,4,5,6     6

As I've already said, this is very similar to a previous question I've asked, but in that case there was a start code I could utilise. I think I need to identify the beginning of the assignment- it's either the first date that follows an assignment with a 'J' code, or just the earliest date but I'm struggling a bit. Any advice appreciated! For reference, my oracle version is 10g, 10.2.0.5.0.

Edit: Accepted answer provided by @Dazzal as it answered my original query but I've posted a new question based on @Gordon Linoff's answer, as I think this will more fit my needs. https://stackoverflow.com/questions/13383560/grouping-data-oracle-sql-based-on-sum

Was it helpful?

Solution

if worker ids ordering is unimportant, use a built in aggregate function.

SQL> select id, min(start_date), max(case when last_worker is not null then end_date end) end_date,
  2         wm_concat(worker_id) worker_ids, max(last_worker) last_worker
  3    from (select id, start_date, end_date, end_code, worker_id, case when end_code ='J' or end_date is null then worker_id end last_worker,
  4                 nvl(max(r) over (partition by id order by end_date nulls last), 1) r
  5    from (select t.*,
  6                  case
  7                    when lag(end_code, 1) over (partition by id order by end_date nulls last) = 'J' then
  8                     row_number() over(partition by id order by end_date nulls last)
  9                  end r
 10             from test t)
 11  )
 12  group by id, r;

I MIN(START END_DATE  WORKER_IDS           LAST_WORKER
- --------- --------- -------------------- -----------
A 02-AUG-03 17-NOV-08 1,2                            2
A 03-MAR-09           3,6,5,4                        6

is it is, you'll have to make your own aggregate function. eg here's one i use on 10g:

SQL> select id, min(start_date), max(case when last_worker is not null then end_date end) end_date,
  2         stragg_num(stragg_num_typ(worker_id, ',', worker_id)) worker_ids, max(last_worker) last_worker
  3    from (select id, start_date, end_date, end_code, worker_id, case when end_code ='J' or end_date is null then worker_id end last_worker,
  4                 nvl(max(r) over (partition by id order by end_date nulls last), 1) r
  5    from (select t.*,
  6                  case
  7                    when lag(end_code, 1) over (partition by id order by end_date nulls last) = 'J' then
  8                     row_number() over(partition by id order by end_date nulls last)
  9                  end r
 10             from test t)
 11  )
 12  group by id, r;

I MIN(START END_DATE  WORKER_IDS           LAST_WORKER
- --------- --------- -------------------- -----------
A 02-AUG-03 17-NOV-08 1,2                            2
A 03-MAR-09           3,4,5,6                        6

SQL>

the definition of which is:

drop function stragg;
drop function stragg_num;
drop type string_agg_type;
drop type stragg_vc_tab;
drop type stragg_vc_typ;
drop type stragg_num_tab;
drop type stragg_num_typ;
create or replace type stragg_vc_typ as object
(
  value   varchar2(4000),
  delim   varchar2(10),
  rown    varchar2(4000)
);
/
create or replace type stragg_vc_tab
as table of stragg_vc_typ;
/
show errors type stragg_vc_tab
create or replace type stragg_num_typ as object
(
  value   varchar2(4000),
  delim   varchar2(10),
  rown    integer
);
/
show errors type stragg_num_typ
create or replace type stragg_num_tab
as table of stragg_num_typ;
/
show errors type stragg_num_tab


create or replace type string_agg_type as object
(
   total clob,
   delim   varchar2(10),
   data    stragg_num_tab,
   data2    stragg_vc_tab,

   static function
        ODCIAggregateInitialize(sctx IN OUT string_agg_type )
        return number,

   member function
        ODCIAggregateIterate(self IN OUT string_agg_type ,
                             value IN stragg_num_typ )
        return number,

   member function
        ODCIAggregateIterate(self IN OUT string_agg_type ,
                             value IN stragg_vc_typ )
        return number,

   member function
        ODCIAggregateTerminate(self IN string_agg_type,
                               returnValue OUT  varchar2,
                               flags IN number)
        return number,

   member function
        ODCIAggregateMerge(self IN OUT string_agg_type,
                           ctx2 IN string_agg_type)
        return number
);
/
show errors type string_agg_type
create or replace type body string_agg_type
is


static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
return number
is
begin
    sctx := string_agg_type( null, null, null, null );
    return ODCIConst.Success;
end;

member function ODCIAggregateIterate(self IN OUT string_agg_type,
                                     value IN stragg_num_typ )
return number
is
begin
    if (delim is null)
    then
      delim := value.delim;
    end if;
    if (data is null)
    then
      data := stragg_num_tab();
    end if;
    data.extend;
    data(data.last) := value;
    return ODCIConst.Success;
end;

member function ODCIAggregateIterate(self IN OUT string_agg_type,
                                     value IN stragg_vc_typ )
return number
is
begin
    if (delim is null)
    then
      delim := value.delim;
    end if;
    if (data2 is null)
    then
      data2 := stragg_vc_tab();
    end if;
    data2.extend;
    data2(data2.last) := value;
    return ODCIConst.Success;
end;

member function ODCIAggregateTerminate(self IN string_agg_type,
                                       returnValue OUT varchar2,
                                       flags IN number)
return number
is
  v_delim  varchar2(10);
begin
    if data is not null 
    then
      for r_item in (select d.value
                       from table(data) d
                      order by d.rown)
      loop
        returnValue := returnValue || v_delim || r_item.value;
        v_delim := self.delim;
      end loop;
    else
      for r_item in (select d.value
                       from table(data2) d
                      order by d.rown)
      loop
        returnValue := returnValue || v_delim || r_item.value;
        v_delim := self.delim;
      end loop;
    end if;
    return ODCIConst.Success;
end;

member function ODCIAggregateMerge(self IN OUT string_agg_type,
                                   ctx2 IN string_agg_type)
return number
is
begin
    self.total := self.total || ctx2.total;
    return ODCIConst.Success;
end;


end;
/
show errors type body string_agg_type
CREATE or replace FUNCTION stragg_num(input stragg_num_typ )
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING string_agg_type;
/

CREATE or replace FUNCTION stragg(input stragg_vc_typ )
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING string_agg_type;
/

OTHER TIPS

You can do this with set operations. The following identifies periods where there is no gap -- that is, does not use end codes. It determines if the record starts a period using the lag function. It then does a cumulative sum to identify the members within a group, and then an aggregation to bring them together.

select groupnum, MIN(start) as start,
       (case when GroupSeqNum = NumInGroup then max(end) end) as end,
       listagg(WorkerId delimiter ',' order by start) as WorkerIds,
       (case when GroupSeqNum = NumInGroup then max(WorkerId) end) as EndWorker
from (select t.*,
             ROW_NUMBER() over (partition by id, groupnum order by start) as GroupSeqNum,
             COUNT(*) over (partition by id, groupnum) as NumInGroup     
      from (select t.*,
                   SUM(IsStart) over (partition by id order by start) as GroupNum
            from (select t.*,
                         (case when lag(end) over (partition by id order by start) = start - 1 then 0 else 1 end) as IsStart
                  from t
                 ) t
           ) t
    ) t

If you are identifying groups by ending with a "J" code, simply change the definition of IsStart to:

                     (case when lag(EndCode) over (partition by id order by start) = 'J' then 0 else 1 end) as IsStart
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top