Question

I have this table with this data:

DECLARE @tbl TABLE
(
    IDX INTEGER,
    VAL VARCHAR(50)
)

--Inserted values for testing
INSERT INTO @tbl(IDX, VAL) VALUES(1,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(2,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(3,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(4,'B')
INSERT INTO @tbl(IDX, VAL) VALUES(5,'B')
INSERT INTO @tbl(IDX, VAL) VALUES(6,'B')
INSERT INTO @tbl(IDX, VAL) VALUES(7,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(8,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(9,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(10,'C')
INSERT INTO @tbl(IDX, VAL) VALUES(11,'C')
INSERT INTO @tbl(IDX, VAL) VALUES(12,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(13,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(14,'A')
INSERT INTO @tbl(IDX, VAL) VALUES(15,'D')
INSERT INTO @tbl(IDX, VAL) VALUES(16,'D')
Select * From @tbl  -- to see what you have inserted...

The output I'm looking for is the first and last IDX, together with the VAL, for each group of consecutive identical VAL values after ordering by IDX. Note that a VAL may repeat in several separate groups, and the rows may not be stored in ascending IDX order as they are in the insert statements. No cursors, please. For example:

Val  First   Last
=================
A        1      3
B        4      6
A        7      9
C       10     11
A       12     14
D       15     16

Solution

If the idx values are guaranteed to be sequential, then try this:

Select f.val, f.idx first, l.idx last
From @tbl f
   join @tbl l
      on l.val = f.val 
          and l.idx >= f.idx  -- >= so that single-row groups are returned too
          and not exists 
              (Select * from @tbl
               Where val = f.val 
                  and idx = l.idx + 1)
          and not exists  
              (Select * from @tbl
               Where val = f.val 
                  and idx = f.idx - 1)
          and not exists
              (Select * from @tbl
               Where val <> f.val
                 and idx Between f.idx and l.idx)
order by f.idx

If the idx values are not sequential, then it needs to be a bit more complex:

Select f.val, f.idx first, l.idx last
From @tbl f
   join @tbl l
      on l.val = f.val 
          and l.idx >= f.idx  -- >= so that single-row groups are returned too
          and not exists 
              (Select * from @tbl
               Where val = f.val 
                  and idx = (select Min(idx)
                             from @tbl 
                             where idx > l.idx))
          and not exists  
              (Select * from @tbl
               Where val = f.val 
                  and idx = (select Max(idx)
                             from @tbl 
                             where idx < f.idx))
          and not exists
              (Select * from @tbl
               Where val <> f.val
                 and idx Between f.idx and l.idx)
order by f.idx

OTHER TIPS

SQL Server 2012

In SQL Server 2012, you can use a chain of CTEs with the LAG/LEAD analytic functions, as below. The code does not assume anything about the type or sequence of idx; it finds the first and last occurrence of val within each window.

;with cte as
(
select val, idx,  
ROW_NUMBER() over(order by (select 0)) as urn  --row_number without ordering
from @tbl),
cte1 as
(
select urn, val, idx, 
lag(val, 1) over(order by urn) as prevval, 
lead(val, 1) over(order by urn) as nextval 
from cte
),
cte2 as
(
select val, idx, ROW_NUMBER() over(order by (select 0)) as orn,
(ROW_NUMBER() over(order by (select 0))+1)/2 as prn from cte1
where (prevval <> nextval or prevval is null or nextval is null)
),
cte3 as
(
select val, FIRST_VALUE(idx) over(partition by prn order by prn) as firstidx,
LAST_VALUE(idx) over(partition by prn order by prn) as lastidx, orn
from cte2
),
cte4 as
(
select val, firstidx, lastidx, min(orn) as rn
from cte3
group by val, firstidx, lastidx
)
select val, firstidx, lastidx
from cte4
order by rn;

SQL Server 2008

In SQL Server 2008, the code is a bit more tortured because the LAG/LEAD analytic functions are not available. Here too, the code does not assume anything about the type or sequence of idx; it finds the first and last occurrence of val within each window.

;with cte as
(
select val, idx,  ROW_NUMBER() over(order by (select 0)) as urn
from @tbl),
cte1 as
(
select m.urn, m.val, m.idx,
_lag.val as prevval, _lead.val as nextval
from cte as m
left join cte as _lag
on _lag.urn = m.urn-1
left join cte AS _lead
on _lead.urn = m.urn+1),
cte2 as
(
select val, idx, ROW_NUMBER() over(order by (select 0)) as orn,
(ROW_NUMBER() over(order by (select 0))+1)/2 as prn from cte1
where (prevval <> nextval or prevval is null or nextval is null)),
cte3 as
( select *, ROW_NUMBER() over(partition by prn order by orn) as rownum
from cte2), 
cte4 as 
(select o.val, (select i.idx from cte3 as i where i.rownum = 1 and i.prn = o.prn) 
as firstidx,
(select i.idx from cte3 as i where i.rownum = 2 and i.prn = o.prn) as lastidx, 
o.orn from cte3 as o), 
cte5 as ( 
select val, firstidx, lastidx, min(orn) as rn
from cte4
group by val, firstidx, lastidx
)
select val, firstidx, lastidx
from cte5
order by rn;

Note: Both solutions assume that the database engine returns rows in insertion order, although a relational database does not guarantee any row order without an explicit ORDER BY.
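
Since the question asks for groups after ordering by IDX, one option is to order by IDX explicitly instead of relying on insertion order. The following is a minimal sketch for SQL Server 2012 or later, not part of the answer above. The names is_start and grp are made up for illustration, and the sample rows are a shortened, shuffled variant of the question's data. It flags each row whose VAL differs from the previous row's VAL, then uses a running total of those flags as a group number.

```sql
DECLARE @tbl TABLE (IDX INTEGER, VAL VARCHAR(50));

-- Shortened sample, deliberately inserted out of IDX order
INSERT INTO @tbl (IDX, VAL)
VALUES (4,'B'), (1,'A'), (3,'A'), (2,'A'), (6,'A'), (5,'B');

WITH flagged AS (
    SELECT IDX, VAL,
           -- 1 when this row starts a new run (first row, or VAL changed)
           CASE WHEN LAG(VAL) OVER (ORDER BY IDX) = VAL THEN 0 ELSE 1 END AS is_start
    FROM @tbl
),
grouped AS (
    SELECT IDX, VAL,
           -- running total of run starts = group number of the current row
           SUM(is_start) OVER (ORDER BY IDX ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT VAL, MIN(IDX) AS FirstIDX, MAX(IDX) AS LastIDX
FROM grouped
GROUP BY VAL, grp
ORDER BY FirstIDX;
```

For this sample the query returns A 1 3, B 4 5 and A 6 6, so a single-row group is reported with the same first and last IDX. Because a comparison with NULL is never true, each row with a NULL VAL would start its own group in this sketch.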

One way to do it, at least in SQL Server 2008 and without any special functionality, is to introduce a helper table and a helper variable.

I don't know whether that is possible for you as is, given your other requirements, but it may point you toward a solution, and it does meet your stated requirements of no cursor and no lead/lag.

Basically, I create a helper table and a helper grouping variable (sorry about the naming):

DECLARE @grp TABLE
    (
      idx INTEGER ,
      val VARCHAR(50) ,
      gidx INT
    )

DECLARE @gidx INT = 1

INSERT  INTO @grp
        ( idx, val, gidx )
        SELECT  idx ,
                val ,
                0
        FROM    @tbl AS t

I populate this with the values from your source table @tbl.

Then I do an update trick to assign a value to gidx based on when VAL changes value:

UPDATE  g
SET     @gidx = gidx = CASE WHEN val <> ISNULL(( SELECT val
                                                 FROM   @grp AS g2
                                                 WHERE  g2.idx = g.idx - 1
                                               ), val) THEN @gidx + 1
                            ELSE @gidx
                       END
FROM    @grp AS g

What this does is assign a value of 1 to gidx until VAL changes; it then assigns @gidx + 1, which is also stored back in the @gidx variable, and so on. This gives you the following usable result:

idx val  gidx
1   A   1
2   A   1
3   A   1
4   B   2
5   B   2
6   B   2
7   A   3
8   A   3
9   A   3
10  C   4
11  C   4
12  A   5
13  A   5
14  A   5
15  D   6
16  D   6

Notice that gidx now is a grouping factor.

Then it's a simple matter of extracting the data with a sub select:

SELECT  ( SELECT TOP 1
                    VAL
          FROM      @GRP g3
          WHERE     g2.gidx = g3.gidx
        ) AS Val ,
        MIN(idx) AS First ,
        MAX(idx) AS Last
FROM    @grp AS g2
GROUP BY gidx 

This yields the result:

A   1   3
B   4   6
A   7   9
C   10  11
A   12  14
D   15  16


I'm assuming that IDX values are unique. If they can also be assumed to start from 1 and have no gaps, as in your example, you could try the following SQL Server 2005+ solution:

WITH partitioned AS (
  SELECT
    IDX, Val,
    grp = IDX - ROW_NUMBER() OVER (PARTITION BY Val ORDER BY IDX ASC)
  FROM @tbl
)
SELECT
  Val,
  FirstIDX = MIN(IDX),
  LastIDX  = MAX(IDX)
FROM partitioned
GROUP BY
  Val, grp
ORDER BY
  FirstIDX
;

If IDX values may have gaps and/or may start from a value other than 1, you could use the following modification of the above instead:

WITH partitioned AS (
  SELECT
    IDX, Val,
    grp = ROW_NUMBER() OVER (                 ORDER BY IDX ASC)
        - ROW_NUMBER() OVER (PARTITION BY Val ORDER BY IDX ASC)
  FROM @tbl
)
SELECT
  Val,
  FirstIDX = MIN(IDX),
  LastIDX  = MAX(IDX)
FROM partitioned
GROUP BY
  Val, grp
ORDER BY
  FirstIDX
;

Note: If you end up using either of these queries, please make sure the statement preceding the query is terminated with a semicolon (a statement that starts with WITH requires this), particularly if you are using SQL Server 2008 or a later version.
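
To see why the difference of the two row numbers identifies a group, it can help to look at the intermediate columns. This is only an illustrative sketch: the column names rn_all and rn_val and the five sample rows (with gaps in IDX) are assumptions made for this example, and the multi-row VALUES insert needs SQL Server 2008 or later.

```sql
DECLARE @tbl TABLE (IDX INTEGER, VAL VARCHAR(50));

-- Small sample with gaps in IDX and a repeated VAL
INSERT INTO @tbl (IDX, VAL)
VALUES (10,'A'), (20,'A'), (35,'B'), (40,'B'), (55,'A');

SELECT
  IDX, VAL,
  -- position of the row among all rows
  rn_all = ROW_NUMBER() OVER (                 ORDER BY IDX ASC),
  -- position of the row among rows with the same VAL
  rn_val = ROW_NUMBER() OVER (PARTITION BY VAL ORDER BY IDX ASC),
  -- constant within a run of equal VAL, changes when another VAL intervenes
  grp    = ROW_NUMBER() OVER (                 ORDER BY IDX ASC)
         - ROW_NUMBER() OVER (PARTITION BY VAL ORDER BY IDX ASC)
FROM @tbl
ORDER BY IDX;
```

Here grp is 0 for the first two A rows, 2 for both B rows and 2 for the last A row. The B group and the second A group share the same grp value, which is why the queries above group by both Val and grp rather than by grp alone.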

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow