SQL Server 2000 - Returning a single column value from a 4 column table based on the values of the other 3 columns

StackOverflow https://stackoverflow.com/questions/21307256

Question

I have a table as follows and I want to run a script to return a single PLAN_ID value for each CARE_ID and these are the conditions. How would I do this? I'm writing the script in SQL Server 2005 but it needs to be backwards compatible with SQL Server 2000.

  • If a CARE_ID has only one PLAN_ID then return that PLAN_ID
  • If a CARE_ID has multiple PLAN_IDs and the value of FIRST_TREATMENT_DATE is NULL then return the highest value PLAN_ID
  • If a CARE_ID has multiple PLAN_IDs and the value of FIRST_TREATMENT_DATE is not NULL then return the PLAN_ID which has the most recent date for N5_2_MDT_DATE

My calculations say that the script would return the PLAN_ID values of

1833,65,162,2929,67,93,44,1136,1046,1047,1048,1049,1050,1052,1870,2426

Thank you

    PLAN_ID CARE_ID N5_2_MDT_DATE       FIRST_TREATMENT_DATE
    1833    1       20/08/2011 00:00    NULL
    199     1       23/06/2010 00:00    NULL
    65      4       27/11/2009 00:00    NULL
    162     5       30/07/2010 00:00    NULL
    54      5       15/11/2009 00:00    NULL
    55      5       29/10/2009 00:00    NULL
    63      5       03/09/2009 00:00    NULL
    2929    9       29/01/2013 00:00    NULL
    99      9       08/03/2010 00:00    NULL
    95      9       04/02/2010 00:00    NULL
    64      9       18/11/2009 00:00    NULL
    67      106     14/01/2013 00:00    NULL
    96      106     20/07/2009 00:00    NULL
    93      107     23/02/2010 00:00    21/09/2012 00:00
    44      108     25/12/2009 00:00    NULL
    43      108     07/10/2009 00:00    NULL
    1136    364     18/02/2011 00:00    19/02/2011 00:00
    1122    364     26/01/2011 00:00    19/02/2011 00:00
    1046    1661    25/01/2011 00:00    25/01/2011 00:00
    1047    1662    25/01/2011 00:00    25/01/2011 00:00
    1048    1663    25/01/2011 00:00    01/02/2011 00:00
    1049    1665    25/01/2011 00:00    NULL
    1050    1666    23/01/2011 00:00    01/02/2011 00:00
    1052    1667    01/02/2011 00:00    01/02/2011 00:00
    1870    1781    04/10/2010 00:00    10/02/2011 00:00
    1869    1781    04/10/2010 00:00    10/02/2011 00:00
    1868    1781    04/10/2010 00:00    10/02/2011 00:00
    2426    2246    23/03/2012 00:00    01/07/2012 00:00
    2275    2246    01/01/2012 00:00    01/07/2012 00:00
    2170    2246    14/10/2011 00:00    01/07/2012 00:00
    1784    2246    04/08/2011 00:00    01/07/2012 00:00
    1940    2246    10/07/2011 00:00    01/07/2012 00:00
    1637    2246    20/06/2011 00:00    01/07/2012 00:00
    1539    2246    02/06/2011 00:00    01/07/2012 00:00
    1538    2246    01/06/2011 00:00    01/07/2012 00:00
    1536    2246    31/05/2011 00:00    01/07/2012 00:00
Was it helpful?

Solution

I don't have a 2000 instance handy, but I think I've kept to compatible bits of language. I'm assuming that for the third condition, there won't be two plans with the same N5_2_MDT_DATE.

I actually ran this as a single script with the data setup1 first and then the query, but am rearranging things so that the actual answer appears first:

select t1.CARE_ID,
    CASE
        --Cases one and two are identical, effectively
        WHEN COUNT(*) = 1 OR MAX(t1.FIRST_TREATMENT_DATE) IS NULL
                 THEN MAX(t1.PLAN_ID)
        ELSE MAX(CASE WHEN t1.N5_2_MDT_DATE = t2.LastDate THEN t1.PLAN_ID END)
    END
from @t t1
    inner join
    (select CARE_ID,MAX(N5_2_MDT_DATE) as LastDate
    from @t
    group by CARE_ID
    ) t2
        on t1.CARE_ID = t2.CARE_ID
group by t1.CARE_ID

You'll note that I've collapsed cases one and two since there's no harm in computing the highest plan number when there's only one plan to consider.

Also, we disagree on which plan to return for case 106, but I'm certain 96 is correct rather than 67, by the rules you've listed.

The logic is (mostly) all up in the CASE expression in the select. If there's only a single row in the group or MAX(FIRST_TREATMENT_DATE) across all rows is NULL (which can only happen if all rows in the group have NULL then we just return the MAX(PLAN_ID).

The more complex case is the third. To assist with this, I've got my subquery (t2) which finds the highest N5_2_MDT_DATE value for each CASE_ID. We then use this in the ELSE clause of the CASE expression, inside another aggregate - where we try to ensure that we only actually consider a particular PLAN_ID if its N5_2_MDT_DATE column matches the highest one found by t2 - which should only happen once in each group, if the assumption I outlined above in the first paragraph holds.

For later versions of SQL Server, CTEs and the ROW_NUMBER() function would make this substantially easier to write, I think.


1 Data setup of the table variable I used. This should appear first in the query window if you want to run the above query:

declare @t table (PLAN_ID int not null, CARE_ID int not null,
      N5_2_MDT_DATE datetime not null,FIRST_TREATMENT_DATE datetime null)
insert into @t(PLAN_ID,CARE_ID,N5_2_MDT_DATE,FIRST_TREATMENT_DATE)
SELECT 1833, 1    ,'20110820',NULL        union all
SELECT 199 , 1    ,'20100623',NULL        union all
SELECT 65  , 4    ,'20091127',NULL        union all
SELECT 162 , 5    ,'20100730',NULL        union all
SELECT 54  , 5    ,'20091115',NULL        union all
SELECT 55  , 5    ,'20091029',NULL        union all
SELECT 63  , 5    ,'20090903',NULL        union all
SELECT 2929, 9    ,'20130129',NULL        union all
SELECT 99  , 9    ,'20100308',NULL        union all
SELECT 95  , 9    ,'20100204',NULL        union all
SELECT 64  , 9    ,'20091118',NULL        union all
SELECT 67  , 106  ,'20130114',NULL        union all
SELECT 96  , 106  ,'20090720',NULL        union all
SELECT 93  , 107  ,'20100223','20120921'  union all
SELECT 44  , 108  ,'20091225',NULL        union all
SELECT 43  , 108  ,'20091007',NULL        union all
SELECT 1136, 364  ,'20110218','20110219'  union all
SELECT 1122, 364  ,'20110126','20110219'  union all
SELECT 1046, 1661 ,'20110125','20110125'  union all
SELECT 1047, 1662 ,'20110125','20110125'  union all
SELECT 1048, 1663 ,'20110125','20110201'  union all
SELECT 1049, 1665 ,'20110125',NULL        union all
SELECT 1050, 1666 ,'20110123','20110201'  union all
SELECT 1052, 1667 ,'20110201','20110201'  union all
SELECT 1870, 1781 ,'20101004','20110210'  union all
SELECT 1869, 1781 ,'20101004','20110210'  union all
SELECT 1868, 1781 ,'20101004','20110210'  union all
SELECT 2426, 2246 ,'20120323','20120701'  union all
SELECT 2275, 2246 ,'20120101','20120701'  union all
SELECT 2170, 2246 ,'20111014','20120701'  union all
SELECT 1784, 2246 ,'20110804','20120701'  union all
SELECT 1940, 2246 ,'20110710','20120701'  union all
SELECT 1637, 2246 ,'20110620','20120701'  union all
SELECT 1539, 2246 ,'20110602','20120701'  union all
SELECT 1538, 2246 ,'20110601','20120701'  union all
SELECT 1536, 2246 ,'20110531','20120701'

OTHER TIPS

I'm a bit late but...

select Care_ID, 
case when d.b is null then MAx(Plan_ID) else (select Distinct data.Plan_ID
                                              from 
                                                 (select max(N5_2_MDT_DATE) as N5_2_MDT_DATE
                                                  from data 
                                                  group by CARE_ID) a 
                                              inner join data on data.N5_2_MDT_DATE = a.N5_2_MDT_DATE
                                              Where d.Care_ID = data.Care_ID

 ) End as Plan_ID
from 
(
   select PLAN_ID, CARE_ID, Max(N5_2_MDT_DATE) as a, Max(FIRST_TREATMENT_DATE) as b
   from data
   group by   PLAN_ID, CARE_ID     
  ) d    
Group by CARE_ID, d.b

demo: http://sqlfiddle.com/#!3/834f6/53

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top