Question

I would like to take a simple query on a list of members that are indexed by a number and group them into 'buckets' of equal size. So the base query is:

select my_members.member_index from my_members where my_members.active=1;

Say I get 1000 member index numbers back, now I want to split them into 10 equally sized groups by a max and min member index. Something like:

Active members in 0 through 400 : 100 Active members in 401 through 577 : 100 ... Active members in 1584 through 1765 : 100

The best I could come up with is repeatedly querying for the max(my_members.member_index) with an increasing rownum limit:

  for r in 1 .. 10 loop
  select max(my_members.member_index)
  into ranges(r)
  from my_members
   where  my_members.active = 1
   and rownum < top_row
   order by my_members.member_index asc;
   top_row    := top_row + 100;
  end loop;
Was it helpful?

Solution

NTILE is the way to go - worth reading up on analytic functions as they can hugely simplify your SQL.

Small comment on the original code - doing a rownum restriction before an ORDER BY can produce adverse results

for r in 1 .. 10 loop
   select max(my_members.member_index)
   into ranges(r)
   from my_members
   where  my_members.active = 1
   and rownum < top_row
   order by my_members.member_index asc;
   top_row    := top_row + 100;

end loop;

Try the following :

create table example_nums (numval number)

begin
    for i in 1..100 loop
        insert into example_nums values (i);
   end loop;
end;

SELECT numval FROM example_nums 
WHERE rownum < 5 
ORDER BY numval DESC;

To get the result you expect you need to do

SELECT numval FROM
   (SELECT numval FROM example_nums 
   ORDER BY numval DESC)
WHERE rownum < 5 

(Note - behind the scenes, Oracle will translate this into an efficient sort that only ever holds the 'top 4 items').

OTHER TIPS

It's simple and much faster using the NTILE analytic function:

SELECT member_index, NTILE(10) OVER (ORDER BY member_index) FROM my_members;

Oracle 10g documentation: "NTILE is an analytic function. It divides an ordered data set into a number of buckets indicated by expr and assigns the appropriate bucket number to each row. The buckets are numbered 1 through expr."

Thanks for the help. It took a while to work it all into one statement (for certain reasons that was also a goal), so here's what I came up with that looks like it works for me:

select max(member_index), ranger
  from (SELECT member_index,
                    CASE
                        WHEN rownum < sized THEN 1
                        WHEN rownum < sized*2 THEN 2
                        WHEN rownum < sized*3 THEN 3
                        WHEN rownum < sized*4 THEN 4
                        WHEN rownum < sized*5 THEN 5
                        WHEN rownum < sized*6 THEN 6
                        WHEN rownum < sized*7 THEN 7
                        WHEN rownum < sized*8 THEN 8
                        WHEN rownum < sized*9 THEN 9
                        ELSE 10
                     END ranger
             from my_members,
                    (select count(*) / 10 sized
                        from my_members
                      where active = 1)
            where active = 1
            order by member_index)
 group by ranger;

Give me my results like this:

member_index    ranger
2297683     1
2307055     2
2325667     3
2334819     4
2343982     5
2353325     6
2362247     7
6229146     8
8189767     9
26347329        10

Take a look at the CASE statement in SQL and set a group field based off the ranges you want.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top