Excluding same day drop\adds while preserving the real start and end date
-
28-04-2021 - |
Question
We have a table with status updates for subscriptions to a product. A record is inserted into the table when the subscription begins, and that record is updated with an end date when the subscription ends. One of our systems (no idea which one) sometimes does a "same day drop\add" where it ends the subscription and then begins it again (creating a new record). So the same subscriber ID is attached to multiple records, even though nothing really changed.
Example data would be this:
recID subID start end prodtype
1 19 01/11/2001 01/15/2001 A
2 19 01/15/2001 01/16/2001 A
3 19 01/16/2001 01/20/2001 A
4 19 01/30/2001 01/31/2001 A
This guy started on 1/11 and ended on 1/20. Records 2 and 3 were put in by the system (same day drop add, but weren't really). Record 4 is another subscription Mr. 19 started later.
I have some code that will attempt to resolve only the first (the real) record of each distinct subscription, but it can't find the real end date without using max() and grouping by the subscriber. That of course would show two subscriptions, 1/11 - 1/31 and 1/30 - 1/31, which is wrong.
I'm tearing my hair out trying to resolve this pattern down to two records like this:
subID start end prodtype
19 01/11/2001 01/20/2001 A
19 01/30/2001 01/31/2001 A
This is in Teradata, but its just ANSI SQL, I believe.
Solution
I believe this is ANSI SQL, but I've only tested it on SQL Server.
Basically, the query is able to find true start dates and true end dates independently of each other. Then to associate the start date and end dates, associates start dates with end dates that are greater than the start date... and then shows the smallest end date.
SELECT
startDates.subId,
startDates.startDate,
MIN(endDates.endDate) AS endDate,
startDates.prodType
FROM
(
SELECT
recID, subID, startDate, prodType
FROM yourTable s1
WHERE NOT EXISTS (
SELECT 1
FROM yourTable s2
WHERE
s1.startDate = s2.endDate
AND s1.subId = s2.subId
)
) startDates JOIN
(
SELECT
recID, subID, endDate, prodType
FROM yourTable s1
WHERE NOT EXISTS (
SELECT 1
FROM yourTable s2
WHERE
s1.endDate = s2.startDate
AND s1.subId = s2.subId
)
) endDates ON
startDates.subID = endDates.subID
AND startDates.startDate < endDates.endDate
GROUP BY
startDates.subId,
startDates.startDate,
startDates.prodType
OTHER TIPS
You can find all records with actual end dates with code like this:
select t1.*
from myTable t1 left outer join myTable t2 on
t1.SubID = t2.SubID and
t1.end = t2.start and t2.start is null
You can find start records in an analogous way, of course. Then maybe you can patch them together.
That said, there are times to give up on doing all the processing the select statement, and use a stored proc, or bring all the data back to the client and process there.