Question

I'm working a query and I wasn't sure if my current approach is the "best" approach. I got a table filled with values and a table with buckets that I use to group those values.

There are 2 types of buckets:

  • A priority bucket, if items apply to the conditions they should fall in this bucket and be excluded for any following bucket.
  • A simple range bucket, the values should fall within a range.

I set up a simplified version that can be used to test. the [Filter] column in #tempBUcket here is used for the priority buckets

The table definitions

create table #TempData([Id] int, [value] int);
create table #TempBucket([Id] int, [FromVal] int,[ToVal] int,[Filter] int);

Some test data

INSERT INTO #TempData (Id,value)
    VALUES  (1,1),(2,2),(3,3),(4,4),(5,5),
            (6,6),(7,7),(8,8),(9,9),(10,1),
            (11,2),(12,3),(13,4),(14,5),(15,6),
            (16,7),(17,8),(18,9),(19,10),(20,11);
INSERT INTO #TempBucket(Id,FromVal,ToVal,Filter)
    VALUES  (1,0,3,0),(2,3,6,0),
            (3,6,9,0),(4,9,12,0),
            (5,NULL,NULL,1),(6,NULL,NULL,2);

Right now I have this query which works and gives the results I want but I feel this could be done better and more efficiently.

WITH filteredData as 
    (SELECT tv.*,COALESCE(tb1.Id,tb2.Id) as BucketId FROM #TempData tv
    LEFT JOIN #TempBucket tb1 --prio bucket join
        ON tb1.Filter IS NOT NULL AND tb1.Filter = tv.Value
    LEFT JOIN #TempBucket tb2 --range bucket join
        ON tb2.Fromval < tv.value AND tv.value <= tb2.ToVal
    )
SELECT Count(fd.value) as count ,BucketId FROM filteredData fd GROUP BY BucketId;

dbfiddle.uk

Was it helpful?

Solution

My solution does only 1 scan of #TempBucket but it adds an additional sort, I'm not sure it's more efficient, you should test it on volumes of data:

;with cte as
(
select t.*, tb.id as BucketId,
       row_number() over(partition by t.id order by tb.FromVal) as rn
from #TempData t
     join #TempBucket tb 
         on (t.value = tb.filter)  
         or (tb.Fromval < t.value AND t.value <= tb.ToVal)
)
select count(id) as cnt, BucketId
from cte
where rn = 1
group by BucketId;

To give a priority to the row in priority bucket it sorts by tb.FromVal which is null in case of priority bucket, alternatively you can sort on tb.Filter desc, I just don't know the exact criteria of priority bucket whether the FromVal,ToVal are null or Filter = 0

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top