Question

I'm using SQL Server 2008 R2. I have a large set of records (100k) that are sorted in a specific order (I used ID in my sample data). I must maintain this sequence.

I want to increment the group whenever a set of fields changes. I thought it would be easy with window functions, but I'm stuck.

My data is

ID  Letter  Number
1      A       1
2      B       2
3      B       2
4      B       2
5      A       1

Result required is

ID  Letter  Number  GroupID
1      A       1       1
2      B       2       2
3      B       2       2
4      B       2       2
5      A       1       3

Basically, when the combination of Letter and Number changes in sorted order, I want to increment the group number. That is, even though rows 1 and 5 both contain A 1, they must get different GroupIDs because they are interrupted by rows 2, 3 and 4.

(My real data contains several fields that need to be compared)

-- Create tables to work with / Source and Destination
CREATE TABLE #Items
    (
     ID INT 
    ,Letter VARCHAR(1)
    ,Number INT
    )

CREATE TABLE #Results
    (
     ID INT 
    ,Letter VARCHAR(1)
    ,Number INT
    ,GroupID int
    )


INSERT INTO #Items
        ( ID,Letter, Number )
SELECT 1, 'A', 1
UNION
SELECT 2, 'B', 2
UNION
SELECT 3,'B', 2
UNION
SELECT 4, 'B', 2
UNION
SELECT 5, 'A', 1

SELECT * FROM #Items
ORDER BY ID

INSERT INTO #Results
        ( ID,Letter, Number, GroupID )
SELECT 1, 'A', 1, 1
UNION
SELECT 2, 'B', 2,2
UNION
SELECT 3,'B', 2,2
UNION
SELECT 4, 'B', 2,2
UNION
SELECT 5, 'A', 1,3

Thanks in advance

Mark


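Solution

An easy way to solve this problem is to take the difference of two row numbers: one over the whole set and one within each value. That difference stays constant across a run of consecutive identical values, so together with the value it uniquely identifies each run. Then use dense_rank() to turn it into a group number. Here is the query: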

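-- Partition by every column that defines a group (Letter and Number here),
-- so that a change in either one starts a new run.
select ID, Letter, Number,
       dense_rank() over (order by diff, letter, number) as GroupId
from (select i.*,
             (row_number() over (order by id) -
              row_number() over (partition by letter, number order by id)
             ) as diff
      from #items i
     ) t
order by ID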

Given your data, here is what the row numbers, differences, and final value look like:

ID  Letter  Number   RN-ID  RN-Letter  Diff   GroupId
1      A       1       1       1         0       1
2      B       2       2       1         1       2
3      B       2       3       2         1       2
4      B       2       4       3         1       2
5      A       1       5       2         3       3

Voila! Almost like magic.

Here is the example on SQL Fiddle.
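
One caveat: ranking by diff and the value gives every run its own number, but in general those numbers are not guaranteed to follow the ID order. For the sequence A, B, A, for example, the differences are 0, 1, 1, so the second A run sorts ahead of the B run. If GroupID has to increase along the sequence, a possible variant (a sketch, not part of the original answer) numbers the runs by their first ID instead. The island_start alias is a name made up for this illustration, and the partition lists both Letter and Number on the assumption that every compared column belongs there:

```sql
-- Sketch only: island_start is a hypothetical alias, not from the original answer.
-- Uses only ROW_NUMBER, MIN ... OVER (PARTITION BY ...) and DENSE_RANK,
-- without an ordered window aggregate.
SELECT ID, Letter, Number,
       DENSE_RANK() OVER (ORDER BY island_start) AS GroupID
FROM (SELECT t.*,
             -- first ID of the run this row belongs to
             MIN(ID) OVER (PARTITION BY Letter, Number, diff) AS island_start
      FROM (SELECT i.*,
                   ROW_NUMBER() OVER (ORDER BY ID)
                 - ROW_NUMBER() OVER (PARTITION BY Letter, Number ORDER BY ID) AS diff
            FROM #Items i
           ) t
     ) g
ORDER BY ID;
```

On the sample data the runs start at IDs 1, 2 and 5, so the rows get GroupIDs 1, 2, 2, 2, 3.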

Other tips

You can use a common table expression with a self join to emulate SQL Server 2012's LAG and detect the changes. The CTE is there to make sure the row numbers are consecutive, since id may not be:

WITH cte AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY id) rn FROM #items )
SELECT a.id, a.letter, a.number, 
  1 + SUM(CASE WHEN a.letter<>b.letter OR a.number<>b.number THEN 1 ELSE 0 END) 
      OVER (ORDER BY a.id) groupid
FROM cte a LEFT JOIN cte b ON a.rn = b.rn + 1
ORDER BY a.id

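SQL Fiddle is unresponsive right now, so I can't add a demo. This was tested on SQL Server 2012. Note that the running SUM(...) OVER (ORDER BY ...) is a SQL Server 2012 feature, so on 2008 R2 the running total would have to be computed another way.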
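
A rough 2008 R2-compatible sketch of the same idea replaces the ordered window aggregate, which 2008 R2 does not support, with a correlated subquery. The flags CTE and the is_start column are names invented for this illustration. The first row is flagged explicitly as the start of a run, so no "1 +" offset is needed:

```sql
-- Sketch only: flags and is_start are hypothetical names.
-- Assumes Letter and Number are never NULL; NULLs would need explicit handling,
-- because a comparison with NULL never counts as a change.
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS rn
    FROM #Items
),
flags AS (
    SELECT a.ID, a.Letter, a.Number, a.rn,
           CASE WHEN b.rn IS NULL              -- first row has no predecessor
                  OR a.Letter <> b.Letter
                  OR a.Number <> b.Number
                THEN 1 ELSE 0 END AS is_start
    FROM cte a
    LEFT JOIN cte b ON a.rn = b.rn + 1         -- b is the previous row
)
SELECT f.ID, f.Letter, f.Number,
       -- running count of run starts up to and including this row
       (SELECT SUM(f2.is_start) FROM flags f2 WHERE f2.rn <= f.rn) AS GroupID
FROM flags f
ORDER BY f.ID;
```

The subquery rescans the earlier rows for every row, so the work grows roughly quadratically. With 100k records it may be worth materializing flags into an indexed temp table first, or preferring the row-number difference approach.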

License: CC-BY-SA with attribution
Not affiliated with StackOverflow