Question

We have a table with status updates for subscriptions to a product. A record is inserted into the table when the subscription begins, and that record is updated with an end date when the subscription ends. One of our systems (no idea which one) sometimes does a "same day drop\add" where it ends the subscription and then begins it again (creating a new record). So the same subscriber ID is attached to multiple records, even though nothing really changed.

Example data would be this:

recID subID   start           end        prodtype
1     19    01/11/2001  01/15/2001    A
2     19    01/15/2001  01/16/2001    A
3     19    01/16/2001  01/20/2001    A
4     19    01/30/2001  01/31/2001    A

This guy started on 1/11 and ended on 1/20. Records 2 and 3 were put in by the system (same day drop add, but weren't really). Record 4 is another subscription Mr. 19 started later.

I have some code that will attempt to resolve only the first (the real) record of each distinct subscription, but it can't find the real end date without using max() and grouping by the subscriber. That of course would show two subscriptions, 1/11 - 1/31 and 1/30 - 1/31, which is wrong.

I'm tearing my hair out trying to resolve this pattern down to two records like this:

subID   start           end        prodtype
 19    01/11/2001   01/20/2001    A
 19    01/30/2001   01/31/2001    A

This is in Teradata, but its just ANSI SQL, I believe.

Was it helpful?

Solution

I believe this is ANSI SQL, but I've only tested it on SQL Server.

Basically, the query is able to find true start dates and true end dates independently of each other. Then to associate the start date and end dates, associates start dates with end dates that are greater than the start date... and then shows the smallest end date.

SELECT
    startDates.subId,
    startDates.startDate,
    MIN(endDates.endDate) AS endDate,
    startDates.prodType
FROM
(
    SELECT
        recID, subID, startDate, prodType
    FROM yourTable s1
    WHERE NOT EXISTS (
        SELECT 1
        FROM yourTable s2
        WHERE 
            s1.startDate = s2.endDate
            AND s1.subId = s2.subId
    )
) startDates JOIN
(
    SELECT
        recID, subID, endDate, prodType
    FROM yourTable s1
    WHERE NOT EXISTS (
        SELECT 1
        FROM yourTable s2
        WHERE 
            s1.endDate = s2.startDate
            AND s1.subId = s2.subId
    )
) endDates ON
    startDates.subID = endDates.subID 
    AND startDates.startDate < endDates.endDate
GROUP BY
    startDates.subId,
    startDates.startDate,
    startDates.prodType

Here is the query in action...

OTHER TIPS

You can find all records with actual end dates with code like this:

select t1.*
from myTable t1 left outer join myTable t2 on
t1.SubID = t2.SubID and  
t1.end = t2.start and t2.start is null

You can find start records in an analogous way, of course. Then maybe you can patch them together.

That said, there are times to give up on doing all the processing the select statement, and use a stored proc, or bring all the data back to the client and process there.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top