Question

I am trying to use OVER (PARTITION BY ...) to 'group' rows by certain columns. I understand how PARTITION BY works to some extent, but I want to 'block' the partitions by date. For example, if we have

|col1|col2       |
| A  |01/JAN/2012|
| A  |01/FEB/2012|
| B  |01/MAR/2012|
| B  |01/APR/2012|
| A  |01/MAY/2012|

I want to partition by col1, but I want the last A to be 'different' from the first two, as it is separated date-wise by the 'B' rows.

If I use:

SELECT ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS RNUM, a.* 
FROM table1 a;

it will yield:

|RNUM|col1|col2       |
|   1| A  |01/JAN/2012|
|   2| A  |01/FEB/2012|
|   3| A  |01/MAY/2012|
|   1| B  |01/MAR/2012|
|   2| B  |01/APR/2012|

but what I really want is:

|RNUM|col1|col2       |
|   1| A  |01/JAN/2012|
|   2| A  |01/FEB/2012|
|   1| B  |01/MAR/2012|
|   2| B  |01/APR/2012|
|   1| A  |01/MAY/2012|

Is this possible using OVER (PARTITION BY ...)? At the moment I have fallen back to using a cursor to walk over the data and assign a group id so I can separate the two sequences of 'A', but this is quite slow.

Thanks,

Mark.


Solution 2

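First, compute a GROUP_ID for each row so that runs of the same COL1 value that are separated by other values end up in different groups. Here GROUP_ID is the number of earlier rows (by col2) whose col1 differs from the current row's. Then partition by GROUP_ID together with COL1 in the OVER clause: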


SELECT ROW_NUMBER() OVER (PARTITION BY group_id, col1 ORDER BY col2) AS RNUM, a3.*
FROM
(
  select a1.*,
         (select count(*)
            from t a2
           where a2.col1 <> a1.col1
             and a2.col2 < a1.col2) as GROUP_ID
  from t a1
) a3
order by col2
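
To see why COL1 has to stay in the PARTITION BY list, it can help to look at GROUP_ID on its own. The following is only a sketch: it inlines the five sample rows from the question in a WITH clause named t (an assumption standing in for the real table) and runs the inner query from above.

```sql
-- Sample rows from the question, inlined so the query runs without a table.
WITH t AS (
  SELECT 'A' AS col1, DATE '2012-01-01' AS col2 FROM dual UNION ALL
  SELECT 'A', DATE '2012-02-01' FROM dual UNION ALL
  SELECT 'B', DATE '2012-03-01' FROM dual UNION ALL
  SELECT 'B', DATE '2012-04-01' FROM dual UNION ALL
  SELECT 'A', DATE '2012-05-01' FROM dual
)
SELECT a1.col1, a1.col2,
       -- number of earlier rows (by col2) whose col1 differs from this row's
       (SELECT COUNT(*)
          FROM t a2
         WHERE a2.col1 <> a1.col1
           AND a2.col2 < a1.col2) AS group_id
  FROM t a1
 ORDER BY a1.col2;
-- group_id in col2 order: 0, 0, 2, 2, 2
```

The B rows and the final A row share GROUP_ID 2 (two A rows precede the Bs, two B rows precede the last A), so GROUP_ID alone does not separate them; the pair (GROUP_ID, COL1) does.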

OTHER TIPS

This is possible with a couple of analytic functions:

select col1, col2, row_number() over (partition by grp order by col2) rnum
  from (select col1, col2, max(grp) over(order by col2) grp
          from (select col1, col2, 
                       case 
                         when lag(col1) over (order by col2) != col1
                         then
                           row_number() over (order by col2)
                         when row_number() over(order by col2) = 1 
                         then
                           1
                       end grp
                  from data));

That is, first find the boundaries where col1 changes when the rows are ordered by the col2 date:

select col1, col2,
       case
         when lag(col1) over (order by col2) != col1
         then
           row_number() over (order by col2)
         when row_number() over(order by col2) = 1
         then
           1
       end grp
  from data;

C COL2             GRP
- --------- ----------
A 01-JAN-12          1
A 01-FEB-12
B 01-MAR-12          3
B 01-APR-12
A 01-MAY-12          5

We can then fill in those nulls by carrying the latest boundary value forward with a running max:

select col1, col2, max(grp) over(order by col2) grp
  from (select col1, col2,
               case
                 when lag(col1) over (order by col2) != col1
                 then
                   row_number() over (order by col2)
                 when row_number() over(order by col2) = 1
                 then
                   1
               end grp
          from data);

C COL2             GRP
- --------- ----------
A 01-JAN-12          1
A 01-FEB-12          1
B 01-MAR-12          3
B 01-APR-12          3
A 01-MAY-12          5

Then it's just a case of assigning row_number(), partitioning on grp and ordering by col2.

fiddle: http://sqlfiddle.com/#!4/4818c/1

See my approach below. It's similar to Dazzal's answer, with slightly different logic:


Step1:

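--find the switches to new groups
--lag() is null on the first row; nvl(..., sysdate) swaps in a sentinel
--(sysdate implicitly converted to a string) so that row also starts a group
select col1, col2, 
    case when nvl(lag(col1) over (order by col2),sysdate) <> col1 then 1 end as new_grp
  from data;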

COL1    COL2        NEW_GRP
A   January, 01 2012    1
A   February, 01 2012   (null)
B   March, 01 2012      1
B   April, 01 2012      (null)
A   May, 01 2012        1

Step2:

--identify/mark the groups

select col1, col2, sum(new_grp) over (order by col2) as grp
from(
  select col1, col2, 
    case when nvl(lag(col1) over (order by col2),sysdate) <> col1 then 1 end as new_grp
  from data)
  ;

COL1    COL2        GRP
A   January, 01 2012    1
A   February, 01 2012   1
B   March, 01 2012      2
B   April, 01 2012      2
A   May, 01 2012        3

Step3:

--find the row_number within group
select col1, col2, row_number() over(partition by grp order by col2) rn
from(
  select col1, col2, sum(new_grp) over (order by col2) as grp
  from(
    select col1, col2, 
      case when nvl(lag(col1) over (order by col2),sysdate) <> col1 then 1 end as new_grp
    from data
      )
  );

COL1    COL2        RN
A   January, 01 2012    1
A   February, 01 2012   2
B   March, 01 2012      1
B   April, 01 2012      2
A   May, 01 2012        1
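
A related variant, not taken from the answers above, derives the group marker from the difference of two row numbers instead of lag() plus a running total. This is only a sketch under stated assumptions: the sample rows are inlined in a WITH clause named data, and the column aliases rn_all, rn_col1 and rnum are illustrative names.

```sql
-- Sample rows from the question, inlined so the query runs without a table.
WITH data AS (
  SELECT 'A' AS col1, DATE '2012-01-01' AS col2 FROM dual UNION ALL
  SELECT 'A', DATE '2012-02-01' FROM dual UNION ALL
  SELECT 'B', DATE '2012-03-01' FROM dual UNION ALL
  SELECT 'B', DATE '2012-04-01' FROM dual UNION ALL
  SELECT 'A', DATE '2012-05-01' FROM dual
)
SELECT col1, col2,
       -- within one consecutive run of a col1 value, rn_all - rn_col1 is constant
       ROW_NUMBER() OVER (PARTITION BY col1, rn_all - rn_col1
                          ORDER BY col2) AS rnum
  FROM (SELECT col1, col2,
               ROW_NUMBER() OVER (ORDER BY col2)                   AS rn_all,
               ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn_col1
          FROM data)
 ORDER BY col2;
-- rnum in col2 order: 1, 2, 1, 2, 1
```

For the sample data the differences are 0, 0, 2, 2, 2, so the difference must be combined with col1 in the PARTITION BY list; on its own it would merge the B rows with the last A row.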

You do not need PARTITION BY. Convert your dates to DD/MM/YYYY format and order by them. If you must partition, you can partition by the MM part, which gives 01, 02, 03, ... and is easily converted to a number if needed. But none of that is necessary: keep your queries simple. In the query below, the outer query only re-formats the dates back to DD/MON/YYYY:

SELECT val, to_char(to_date(dt, 'DD/MM/YYYY'), 'DD/MON/YYYY') formatted_date
  FROM
( -- Format your date to DD/MM/YYYY and order by it --
  -- An explicit format mask avoids relying on the session's NLS_DATE_FORMAT --
SELECT 'A' val, to_char(to_date('01/JAN/2012', 'DD/MON/YYYY'), 'DD/MM/YYYY') dt FROM dual
 UNION
SELECT 'A', to_char(to_date('01/FEB/2012', 'DD/MON/YYYY'), 'DD/MM/YYYY') FROM dual
 UNION
SELECT 'B', to_char(to_date('01/MAR/2012', 'DD/MON/YYYY'), 'DD/MM/YYYY') FROM dual
 UNION
SELECT 'B', to_char(to_date('01/APR/2012', 'DD/MON/YYYY'), 'DD/MM/YYYY') FROM dual
 UNION
SELECT 'A', to_char(to_date('01/MAY/2012', 'DD/MON/YYYY'), 'DD/MM/YYYY') FROM dual
ORDER BY 2
)
/

Your dates are then ordered as you wanted them to be:

VAL FORMATTED_DATE
-------------------
A   01/JAN/2012
A   01/FEB/2012
B   01/MAR/2012
B   01/APR/2012
A   01/MAY/2012
Licensed under: CC-BY-SA with attribution