Question

Class| Value
-------------
A    | 1
A    | 2
A    | 3
A    | 10
B    | 1

I am not sure whether it is practical to achieve this in SQL. If the difference between values is less than 5 (or some x), the rows should be grouped together (within the same Class, of course).

Expected result

Class| ValueMin | ValueMax
---------------------------
A    | 1        | 3
A    | 10       | 10
B    | 1        | 1

For fixed intervals, we can easily use "GROUP BY". But here the grouping depends on the values of nearby rows: if the values are consecutive or very close, they are "chained together".

Thank you very much

Assuming MSSQL


Solution 3

Both methods below give the correct result. They rely on the fact that each class must have the same number of group starts as group ends, and that both will be in ascending order, so the nth start pairs with the nth end.

if object_id('tempdb..#temp') is not null drop table #temp

create table #temp (class char(1),Value int);

insert into #temp values ('A',1);
insert into #temp values ('A',2);
insert into #temp values ('A',3);
insert into #temp values ('A',10);
insert into #temp values ('A',13);
insert into #temp values ('A',14);
insert into #temp values ('b',7);
insert into #temp values ('b',8);
insert into #temp values ('b',9);
insert into #temp values ('b',12);
insert into #temp values ('b',22);
insert into #temp values ('b',26);
insert into #temp values ('b',67);

Method 1 Using CTE and row offsets

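with cte as
(select class,value,ROW_NUMBER() over ( partition by class order by value ) as R
 -- de-duplicate first: a DISTINCT in the same select as ROW_NUMBER() would not remove duplicates
 from (select distinct class,value from #temp) as d),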
cte2 as
(
    select 
        c1.class
        ,c1.value
        ,c2.R as PreviousRec
        ,c3.r as NextRec
    from 
        cte c1
        left join cte c2 on (c1.class = c2.class and c1.R= c2.R+1 and c1.Value < c2.value + 5)
        left join cte c3 on (c1.class = c3.class and c1.R= c3.R-1 and c1.Value > c3.value - 5)
)

select
    Starts.Class
    ,Starts.Value as StartValue
    ,Ends.Value as EndValue
from
    (
     select 
        class
        ,value
        ,row_number() over ( partition by class order by value ) as GroupNumber
    from cte2
        where PreviousRec is null) as Starts join
    (
     select 
        class
        ,value
        ,row_number() over ( partition by class order by value ) as GroupNumber
    from cte2
        where NextRec is null) as Ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber

Method 2 Inline views using not exists

select
        Starts.Class
        ,Starts.Value as StartValue
        ,Ends.Value as EndValue            
from
    (
        select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
        from
            (select distinct class,value from #temp) as T
        where not exists (select 1 from #temp where class=t.class and Value < t.Value and Value > t.Value -5 )
    ) Starts join
    (
        select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
        from
            (select distinct class,value from #temp) as T
        where not exists (select 1 from #temp where class=t.class and Value > t.Value and Value < t.Value +5 )
    ) ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber

In both methods I use a select distinct to begin with, because if you have a duplicate entry at a group start or end, things go awry without it.
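The pairing idea behind both methods can also be sketched outside SQL. The Python below is only an illustration and not part of the original answer; the function name pair_starts_and_ends and the gap parameter are made up for this sketch. It collects the values that have no close predecessor (starts) and no close successor (ends), then pairs them up in ascending order.

```python
def pair_starts_and_ends(values, gap=5):
    """Pair group starts with group ends for one class (illustrative sketch)."""
    # De-duplicate and sort, mirroring the SELECT DISTINCT step in the SQL.
    distinct = sorted(set(values))
    # A start has no smaller value within `gap` below it.
    starts = [v for v in distinct
              if not any(v - gap < other < v for other in distinct)]
    # An end has no larger value within `gap` above it.
    ends = [v for v in distinct
            if not any(v < other < v + gap for other in distinct)]
    # Both lists are ascending and equally long, so the nth start
    # belongs with the nth end.
    return list(zip(starts, ends))


print(pair_starts_and_ends([1, 2, 3, 10, 13, 14]))       # [(1, 3), (10, 14)]
print(pair_starts_and_ends([7, 8, 9, 12, 22, 26, 67]))   # [(7, 12), (22, 26), (67, 67)]
```

The two calls use the class A and class b values from the sample data, so the pairs should match what the SQL methods return for those classes.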

OTHER TIPS

You are trying to group things by gaps between values. The easiest way to do this is to use the lag() function to find the gaps:

select class, min(value) as minvalue, max(value) as maxvalue
from (select class, value,
             sum(IsNewGroup) over (partition by class order by value) as GroupId
      from (select class, value,
                   (case when lag(value) over (partition by class order by value) > value - 5
                         then 0 else 1
                    end) as IsNewGroup
            from t
           ) t
     ) t
group by class, groupid;

Note that this requires SQL Server 2012 or later, for lag() and the cumulative sum.
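If no SQL Server instance is at hand, the same query shape can be tried with Python's built-in sqlite3 module. This is a rough harness, not the answerer's code: it assumes an SQLite build with window-function support (3.25 or later), keeps the table name t from the query, renames the derived-table aliases for readability, and adds a hypothetical gap parameter in place of the hard-coded 5.

```python
import sqlite3

# Same logic as the lag() query, with the threshold passed as :gap.
QUERY = """
select class, min(value) as minvalue, max(value) as maxvalue
from (select class, value,
             sum(IsNewGroup) over (partition by class order by value) as GroupId
      from (select class, value,
                   (case when lag(value) over (partition by class order by value) > value - :gap
                         then 0 else 1
                    end) as IsNewGroup
            from t
           ) flagged
     ) numbered
group by class, GroupId
order by class, minvalue
"""


def group_by_gaps(data, gap=5):
    """Load (class, value) pairs into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (class text, value integer)")
    conn.executemany("insert into t values (?, ?)", data)
    result = conn.execute(QUERY, {"gap": gap}).fetchall()
    conn.close()
    return result


# Sample rows from the question.
rows = [("A", 1), ("A", 2), ("A", 3), ("A", 10), ("B", 1)]
print(group_by_gaps(rows))  # [('A', 1, 3), ('A', 10, 10), ('B', 1, 1)]
```

The first row of each class has no previous value, so lag() returns NULL, the comparison is not true, and the row is flagged as a new group. The running sum of those flags then serves as the group id.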

Update: This answer is incorrect

Assuming the table you gave is called sd_test, the following query will give you the output you are expecting.

In short, we need a way to find the value on the previous row. This is determined using a join on row ids. Then we create a group flag to see if the difference is less than 5, and then it is just a regular 'Group By'.

If your version of SQL Server supports windowing functions with partitioning, the code would be much more readable.

SELECT 
A.CLASS
,MIN(A.VALUE) AS MIN_VALUE
,MAX(A.VALUE) AS MAX_VALUE
FROM
     (SELECT 
      ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
      ,CLASS
      ,VALUE
      FROM SD_TEST) AS A
LEFT JOIN 
     (SELECT 
       ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
      ,CLASS
      ,VALUE
     FROM SD_TEST) AS B
ON A.CLASS = B.CLASS AND A.ROW_ID=B.ROW_ID+1
GROUP BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END
ORDER BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END DESC

PS: I think the above is ANSI compliant, so it should run in most SQL variants. Someone can correct me if it is not.
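One way to see why this query is marked as incorrect is to run it on a class that contains two separate chains. The sketch below is not from the original answer: it uses Python's sqlite3 module as a stand-in for SQL Server (assuming SQLite 3.25 or later for ROW_NUMBER), drops the ORDER BY, and borrows the class A values 1, 2, 3, 10, 13, 14 from the sample data in the solution above. The helper name run_flag_query is made up for this example.

```python
import sqlite3

# The query from this answer, minus the ORDER BY; results are sorted in Python.
QUERY = """
SELECT A.CLASS, MIN(A.VALUE) AS MIN_VALUE, MAX(A.VALUE) AS MAX_VALUE
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID, CLASS, VALUE
      FROM SD_TEST) AS A
LEFT JOIN
     (SELECT ROW_NUMBER() OVER (PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID, CLASS, VALUE
      FROM SD_TEST) AS B
ON A.CLASS = B.CLASS AND A.ROW_ID = B.ROW_ID + 1
GROUP BY A.CLASS, CASE WHEN ABS(COALESCE(B.VALUE, 0) - A.VALUE) < 5 THEN 1 ELSE 0 END
"""


def run_flag_query(data):
    """Run the 0/1-flag query on (class, value) pairs in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SD_TEST (CLASS TEXT, VALUE INTEGER)")
    conn.executemany("INSERT INTO SD_TEST VALUES (?, ?)", data)
    result = sorted(conn.execute(QUERY).fetchall())
    conn.close()
    return result


two_chains = [("A", v) for v in (1, 2, 3, 10, 13, 14)]
print(run_flag_query(two_chains))  # [('A', 1, 14), ('A', 10, 10)]
```

The chaining rule from the question would give (1, 3) and (10, 14) here. Because the grouping key is only a 0/1 flag, every row that is close to its predecessor lands in the same group per class, no matter which chain it belongs to. The COALESCE to 0 also treats the first row as "close" whenever its value is below 5.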

Here is one way of getting the information you are after:

SELECT Under5.Class,
  (
    SELECT MIN(m2.Value) 
    FROM MyTable AS m2 
    WHERE m2.Value < 5 
      AND m2.Class = Under5.Class
  ) AS ValueMin,
  (
    SELECT MAX(m3.Value) 
    FROM MyTable AS m3 
    WHERE m3.Value < 5 
      AND m3.Class = Under5.Class
  ) AS ValueMax
FROM 
(
  SELECT DISTINCT m1.Class
  FROM MyTable AS m1 
  WHERE m1.Value < 5
) AS Under5
UNION
SELECT Over4.Class,
  (
    SELECT MIN(m4.Value) 
    FROM MyTable AS m4 
    WHERE m4.Value >= 5 
      AND m4.Class = Over4.Class
  ) AS ValueMin,
  (
    SELECT MAX(m5.Value) 
    FROM MyTable AS m5 
    WHERE m5.Value >= 5 
      AND m5.Class = Over4.Class
  ) AS ValueMax
FROM 
(
  SELECT DISTINCT m6.Class
  FROM MyTable AS m6 
  WHERE m6.Value >= 5
) AS Over4
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow