Question

I am not a fluent english speaker, so it is hard to explain what I mean. I have a table which lists for each yearmonth for each team a value (nvarchar). Now I need to find for each yearmonth one of the teams that has the most occurences of that value. Look at the example below:

The table looks like this:

yearmonth (int) | team_id (int) | value (nvarchar)
201301 | 10 | 4325
201301 | 9  | 4325
201301 | 12 | 8325
201301 | 25 | 4325
201302 | 25 | 8345
201302 | 10 | 2544
201302 | 35 | 2544
201305 | 10 | 3689
201305 | 7  | 8654
201305 | 8  | 8654

What I now need is:

yearmonth | team_id
201301 | 9
201302 | 10
201305 | 7

.

201301 = 9 (or 10 or 25 because 4325 occurs most often, I don't care which team I get
201302 = 10 (or 35) because 2544 occurs  most often, I don't care which team I get)
201305 = 7 (or 8) because 8654 occurs  most often, I don't care which team I get

I have been looking at rank(), dense_rank(), different grouping and sorting options but they all help me to find the value that occurs most. I need to find a teamid that goes with the value that occurs the most... And that is where I get lost..... Somehow my gut tells me it should not be that hard but I've been headbutting myself for three days on this :s

Was it helpful?

Solution

You can use ROW_NUMBER ordering by COUNT for this greatest n per group query.

WITH t
     AS (SELECT yearmonth,
                value,
                min(team_id) AS team_id,
                ROW_NUMBER() OVER (PARTITION BY yearmonth 
                                      ORDER BY COUNT(*) DESC) AS RN
         FROM   YourTable
         GROUP  BY yearmonth,
                   value)
SELECT yearmonth,
       team_id
FROM   T
WHERE  RN = 1 

SQL Fiddle

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top