Count Total time difference between selected row
-
14-02-2021 - |
Question
My data contains an ID, status and time(hh:mm:ss).
I need to calculate the total time between every 'Location' and 'Out of tunnel' status
Current table
ID || time || Status
________________________________________________
1 || 08:09:14 || Out of Tunnel
2 || 08:10:59 || Location
3 || 08:11:42 || Out of Tunnel
4 || 08:11:55 || Location
5 || 08:16:36 || Location
6 || 09:41:36 || Location
7 || 09:43:10 || Out of Tunnel
8 || 09:43:19 || Location
The 'Location' status marks where I need to start counting. I need the count to stop the next time the status is 'Out Of Tunnel'. If there is no corresponding 'Out of Tunnel' given, then don't count it.
Each total would then be summed to give a grand total. For example:
- Start at ID2: 08:10:59
- End at ID3: 08:11:42
- Total 1: 00:00:43
- Start at ID4: 08:11:42
- End at ID7: 09:43:10
- Total 2: 01:31:28
- Grand Total: Total1 + Total2 = 01:32:11
Desired output
______________________
| Total Time in Tunnel |
|______________________|
| |
| 01:32:11 |
|______________________|
Solution
The second round starting with ID 4 has the time 08:11:55 in your sample data, so if indeed that's the time you're supposed to use as the starting point, the total should amount to 01:31:58. At any rate, here's a solution using the LAG and LEAD window functions. If you want to prevent a sort in the plan, make sure you create the following supporting index:
create unique index idx_time_status on dbo.t1(time, status);
If you're running on 2016+, you can enable batch processing by creating the following dummy index:
create nonclustered columnstore index idx_cs on dbo.t1(id) where id = -1 and id = -2;
Prior to SQL Server 2019, SQL Server would not consider using batch processing if there's no columnstore index present on at least one of the participating tables in the query. Creating this dummy index, even if it's really meaningless in its own right, enables the use of the more optimal batch processing for window functions. Check out the plans both with and without the index. I explain this in detail here.
Here's the solution code:
with c1 as
(
select *,
case
when status = 'location'
and lag(status) over(order by time) = 'location' then 'no'
else 'yes'
end as keeper
from dbo.t1
where status in ('location', 'out of tunnel')
),
c2 as
(
select *, lead(time) over(order by time) as nxt
from c1
where keeper = 'yes'
)
select dateadd(second, sum(datediff(second, time, nxt)), cast('00:00:00' as time(0))) as total
from c2
where status = 'location';
I formatted the output as TIME assuming the total amount of time will be less than 24 hours. If it could be more, you'd just need to add a bit of formatting logic.
OTHER TIPS
WITH cte AS (
SELECT t2.time - MIN(t1.time) delta
FROM test t1
JOIN test t2 ON t1.time < t2.time
AND t2.status = 'Out of Tunnel'
LEFT JOIN test t3 ON t1.time < t3.time
AND t3.time < t2.time
AND t3.status = 'Out of Tunnel'
WHERE t1.status = 'Location'
AND t3.id IS NULL
GROUP BY t2.time
)
SELECT SUM(delta) "Total Time in Tunnel"
FROM cte
The only imperfection - сonvert time
field to comparable, substractable and summable data form/type (or use proper functions in proper places) by yourself.