Question

How would I be able to get N results for several groups in an oracle query.

For example, given the following table:

|--------+------------+------------|
| emp_id | name       | occupation |
|--------+------------+------------|
|      1 | John Smith | Accountant |
|      2 | Jane Doe   | Engineer   |
|      3 | Jack Black | Funnyman   |
|--------+------------+------------|

There are many more rows with more occupations. I would like to get three employees (lets say) from each occupation.

Is there a way to do this without using a subquery?

Was it helpful?

Solution

This produces what you want, and it uses no vendor-specific SQL features like TOP N or RANK().

SELECT MAX(e.name) AS name, MAX(e.occupation) AS occupation 
FROM emp e 
  LEFT OUTER JOIN emp e2 
    ON (e.occupation = e2.occupation AND e.emp_id <= e2.emp_id) 
GROUP BY e.emp_id 
HAVING COUNT(*) <= 3 
ORDER BY occupation;

In this example it gives the three employees with the lowest emp_id values per occupation. You can change the attribute used in the inequality comparison, to make it give the top employees by name, or whatever.

OTHER TIPS

I don't have an oracle instance handy right now so I have not tested this:

select *
from (select emp_id, name, occupation,
      rank() over ( partition by occupation order by emp_id) rank
      from employee)
where rank <= 3

Here is a link on how rank works: http://www.psoug.org/reference/rank.html

Add RowNum to rank :

select * from 
         (select emp_id, name, occupation,rank() over ( partition by occupation order by emp_id,RowNum) rank   
                      from employee) 
         where rank <= 3 

tested this in SQL Server (and it uses subquery)

select emp_id, name, occupation
from employees t1
where emp_id IN (select top 3 emp_id from employees t2 where t2.occupation = t1.occupation)

just do an ORDER by in the subquery to suit your needs

I'm not sure this is very efficient, but maybe a starting place?

select *
from people p1
    join people p2
        on p1.occupation = p2.occupation
    join people p3
        on p1.occupation = p3.occupation
        and p2.occupation = p3.occupation
where p1.emp_id != p2.emp_id
    and p1.emp_id != p3.emp_id

This should give you rows that contain 3 distinct employees all in the same occupation. Unfortunately, it will give you ALL combinations of those.

Can anyone pare this down please?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top